The last post mentioned other groups' projects with dictionaries and language resources. I thought I'd mention some of the projects I've been working on in this area.
I've been creating build scripts with the LM BLD project ( http://www.distasis.com/cpp/lmbld.htm ) so that I'll have automated, repeatable steps to build programs, libraries and other types of packages. Here are some of things I've been working on.
The Moby project is a very nice dictionary resource. Using their thesaurus, I was able to create a word list and a simple dictionary in stardict format. I use it with Open Source programs like scramble.
The Strongs concordance is in the public domain. I've created a translation dictionary in stardict format with it.
I happen to like the stardict dictionary format. There are several nice programs that can work with that format. I wanted something lightweight that would work well on older systems or let me create my own GUI interfaces. The closest thing I could find to what I wanted was sdcv. However, there were a few issues I had with it. The biggest is that it requires glib as a dependency and I didn't want to install GTK+ related dependencies on my systems. The second issue I ran into was that it couldn't handle some of the newer versions of the stardict format. Since the code is GNU GPL licensed, I started with it and made several modifications and customizations. The result is sdcv2 which can be linked to my own Unicode shared libraries in place of glib if desired and can work with dictionaries in more recent stardict formats. It may not make use of all the latest features in the newer formats, but it can at least access information from them.
I've seen other projects that use the sdcv library as a back-end and create their own GUI for a dictionary program. It makes sense if the program uses GTK+, but it seems awkward for Qt or other GUI programs to require GTK+ related dependencies. With sdcv2, there are no GTK+ related dependencies.
I would love to find a dictionary with a FLTK GUI, especially if it can handle stardict format. Since, I haven't been able to find one, I may try to write one at some point. I've also been thinking about creating a pdcurses front end. When I use sdcv (or sdcv2) from the command line, certain systems like Windows can't handle input or output of certain Unicode characters correctly. I've added support for SDL 2.x, SDL2_ttf and the ability to work with a range of Unicode characters within the UCS-2 character set to pdcurses. I think pdcurses would make an interesting front end for a program using the sdcv2 library. It would work on any system that supports SDL 1.x or 2.x, including more unusual operating systems like Syllable and Haiku. Would like to hear from others who may be interested in or are working on similar projects.
The dictzip program compresses dictionary files. It uses an extension to the gzip format with extra fields to include information about the compressed dictionary. Files compressed with this format often use the .dz extension. You can use dictzip with stardict files to save space. dictzip is primarily a POSIX compliant program, so it doesn't convert well to certain systems. I was able to find a Windows port that limited the program's functionality, but did enough to get the job of compression done. I've made some modifications to it and am using it as a cross-platform method of compressing stardict dictionary files.
Several utilities and conversion programs were created for stardict in the stardict-tools project. Similar to stardict and sdcv, glib is a dependency for stardict-tools. There are a few tools that use a GTK+ front-end as well. I personally only use the stardict-tools to convert tab delimited files and files in babylon format to stardict. So, I modified the command line tools that do those conversions to build without glib. I also created my own makefile just to build the tools I use.
I've searched and I've yet to find a rhyming dictionary in stardict format. So, I'm working on creating one. It's a slow process. I've taken a public domain rhyming dictionary as a starting point and I'm in the process of editing it and converting it to the format I need.
I've also been searching for an Open Source C/C++ grammer checker, but I've yet to find one that I like.
These are just some of the projects I'm working on. If you're interested in comparing notes on these topics or if you have recommendations of other dictionary and word related projects you like, feel free to contact me ( http://www.distasis.com/contact.htm ).
I've been creating build scripts with the LM BLD project ( http://www.distasis.com/cpp/lmbld.htm ) so that I'll have automated, repeatable steps to build programs, libraries and other types of packages. Here are some of things I've been working on.
The Moby project is a very nice dictionary resource. Using their thesaurus, I was able to create a word list and a simple dictionary in stardict format. I use it with Open Source programs like scramble.
The Strongs concordance is in the public domain. I've created a translation dictionary in stardict format with it.
I happen to like the stardict dictionary format. There are several nice programs that can work with that format. I wanted something lightweight that would work well on older systems or let me create my own GUI interfaces. The closest thing I could find to what I wanted was sdcv. However, there were a few issues I had with it. The biggest is that it requires glib as a dependency and I didn't want to install GTK+ related dependencies on my systems. The second issue I ran into was that it couldn't handle some of the newer versions of the stardict format. Since the code is GNU GPL licensed, I started with it and made several modifications and customizations. The result is sdcv2 which can be linked to my own Unicode shared libraries in place of glib if desired and can work with dictionaries in more recent stardict formats. It may not make use of all the latest features in the newer formats, but it can at least access information from them.
I've seen other projects that use the sdcv library as a back-end and create their own GUI for a dictionary program. It makes sense if the program uses GTK+, but it seems awkward for Qt or other GUI programs to require GTK+ related dependencies. With sdcv2, there are no GTK+ related dependencies.
I would love to find a dictionary with a FLTK GUI, especially if it can handle stardict format. Since, I haven't been able to find one, I may try to write one at some point. I've also been thinking about creating a pdcurses front end. When I use sdcv (or sdcv2) from the command line, certain systems like Windows can't handle input or output of certain Unicode characters correctly. I've added support for SDL 2.x, SDL2_ttf and the ability to work with a range of Unicode characters within the UCS-2 character set to pdcurses. I think pdcurses would make an interesting front end for a program using the sdcv2 library. It would work on any system that supports SDL 1.x or 2.x, including more unusual operating systems like Syllable and Haiku. Would like to hear from others who may be interested in or are working on similar projects.
The dictzip program compresses dictionary files. It uses an extension to the gzip format with extra fields to include information about the compressed dictionary. Files compressed with this format often use the .dz extension. You can use dictzip with stardict files to save space. dictzip is primarily a POSIX compliant program, so it doesn't convert well to certain systems. I was able to find a Windows port that limited the program's functionality, but did enough to get the job of compression done. I've made some modifications to it and am using it as a cross-platform method of compressing stardict dictionary files.
Several utilities and conversion programs were created for stardict in the stardict-tools project. Similar to stardict and sdcv, glib is a dependency for stardict-tools. There are a few tools that use a GTK+ front-end as well. I personally only use the stardict-tools to convert tab delimited files and files in babylon format to stardict. So, I modified the command line tools that do those conversions to build without glib. I also created my own makefile just to build the tools I use.
I've searched and I've yet to find a rhyming dictionary in stardict format. So, I'm working on creating one. It's a slow process. I've taken a public domain rhyming dictionary as a starting point and I'm in the process of editing it and converting it to the format I need.
I've also been searching for an Open Source C/C++ grammer checker, but I've yet to find one that I like.
These are just some of the projects I'm working on. If you're interested in comparing notes on these topics or if you have recommendations of other dictionary and word related projects you like, feel free to contact me ( http://www.distasis.com/contact.htm ).