DocFetcher was completely rewritten between version 1.0.3 and version 1.1 beta 1. This page gives an overview of the most important changes introduced by the rewrite.
General
- Support for 64-bit operating systems
- Support for Mac OS X
- Reworked manual; it now includes more information on advanced topics such as query syntax and regular expressions
- Faster startup, since indexes are now loaded one by one in a separate process
- Reworked structure of DocFetcher folder and configuration files
- Multiple instances of DocFetcher can now be run simultaneously without problems. More precisely, when indexes are added, modified or removed in one program instance, all other running instances automatically reload their indexes from disk.
- The ability to create "temporary indexes" was dropped
- Windows Explorer integration: Earlier versions of DocFetcher installed a "Search With DocFetcher" entry in the Windows Explorer context menu. This feature has been dropped. As a replacement, you can now paste files from the clipboard into the search scope area.
Indexing
- Archive indexing: DocFetcher can now traverse archives. The following archive formats are supported: zip and zip-derived formats, 7z, rar, and self-extracting (SFX) zip and 7z archives.
- Outlook PST file indexing
- Indexing of and searching in filenames
- Pausing and resuming indexing; partially built indexes can already be searched
- NTFS junctions and infinite loops: NTFS junctions are now ignored during indexing. Earlier versions did not ignore them and could get stuck in an infinite loop when encountering a circular directory structure. See bug report.
- Improved RTF parser (now includes Unicode support)
- Encoding detection for plain text files
- MIME type detection (i.e. detection of file type by content)
- Extended options to exclude files from indexing
- Relative paths now work for files outside the DocFetcher folder, so you can move around the portable edition of DocFetcher without having to put all documents in the DocFetcher folder
- Page-wise indexing feedback for PDF files (i.e. while parsing a PDF file, the current page number and the total number of pages are displayed)
- Indexing can now be cancelled while a (potentially large) PDF file is being processed; in earlier versions, indexing stopped only after processing of that PDF file had finished.
- Ability to index in the background while searching; the indexing dialog can be minimized to the status bar
- Fixed certain seemingly random crashes that occurred during indexing in previous versions (multi-threading issues)
- The file list that shows the files indexed so far is now updated more efficiently in order to avoid slowing down indexing
- New rectangular style for tabs on the indexing dialog
- During indexing, more low-level error messages are shown
- Smarter skipping of unparsable files: DocFetcher now remembers which files it failed to index last time, so it does not waste time trying to index them again during subsequent runs. (However, DocFetcher will try to parse them again once they have been modified.)
- Folder watching can now be enabled or disabled separately for each index.
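The skipping of unparsable files described above can be pictured with a small sketch. It is only an illustration under stated assumptions, not DocFetcher's actual implementation: the class name FailedFileRegistry and its methods are hypothetical, and the sketch assumes that "modified" is decided by comparing last-modified timestamps.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: remembers files that failed to parse, keyed by path. */
final class FailedFileRegistry {
    // path -> last-modified timestamp (ms) the file had when parsing failed
    private final Map<String, Long> failures = new HashMap<>();

    /** Records that parsing failed while the file had the given timestamp. */
    public void recordFailure(String path, long lastModified) {
        failures.put(path, lastModified);
    }

    /** Forgets a file, e.g. after it has been parsed successfully. */
    public void recordSuccess(String path) {
        failures.remove(path);
    }

    /** True if the file failed before and has not been modified since. */
    public boolean shouldSkip(String path, long lastModified) {
        Long failedAt = failures.get(path);
        return failedAt != null && failedAt == lastModified;
    }
}
```

An indexer built this way would call shouldSkip before parsing each file. A file whose timestamp differs from the recorded one is parsed again, which matches the behaviour described in the parenthetical remark.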
Searching, results display and preview
- Filter controls on the left side of the GUI can be collapsed
- Added a 'Search' button
- Result table without paging: DocFetcher no longer divides the results into pages; instead, the results are now always displayed on a single page. This works smoothly even for 100,000 files or more thanks to a technique called "virtual tables": table items are not loaded until they become visible to the user for the first time.
- Ability to change the preview font in the preferences
- Page-wise loading of PDF files into the preview pane. This led to an overall improvement in the responsiveness of the preview pane.
- Phrase highlighting: Improved highlighting for phrase search matches in the preview pane. For example, a phrase such as "the quick brown fox" is now highlighted as a whole, not as individual words.
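The "virtual table" idea mentioned for the result table can be sketched independently of any GUI toolkit. The class LazyRows below is hypothetical and only models the principle: a row is created the first time it is requested, so a table with 100,000 entries costs almost nothing until the user scrolls. How DocFetcher wires this into its actual table widget is not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical model of a virtual table: rows are created on first access only. */
final class LazyRows<T> {
    private final int rowCount;
    private final IntFunction<T> loader;                 // builds the row for an index
    private final Map<Integer, T> loaded = new HashMap<>();

    public LazyRows(int rowCount, IntFunction<T> loader) {
        this.rowCount = rowCount;
        this.loader = loader;
    }

    /** Total number of rows, known up front without loading any of them. */
    public int size() {
        return rowCount;
    }

    /** Returns the row at the given index, loading it if it was never shown before. */
    public T get(int index) {
        if (index < 0 || index >= rowCount) {
            throw new IndexOutOfBoundsException("row " + index);
        }
        return loaded.computeIfAbsent(index, loader::apply);
    }

    /** Number of rows that have actually been materialized so far. */
    public int loadedCount() {
        return loaded.size();
    }
}
```

With `new LazyRows<>(100000, i -> "result " + i)`, displaying the first 30 rows calls the loader only 30 times; the remaining rows stay unloaded until they are requested.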
Minor changes and bugfixes
- Fixed wrong info about wildcards in manual: The DocFetcher 1.0.3 manual stated incorrectly that you were not allowed to enter queries that start with a wildcard (e.g., ?oogle or *icrosoft). You can in fact do so, although such queries tend to be a bit slower on average.
- Overriding built-in parsers: A warning message is now shown if the user enters custom file extensions in the indexing dialog that would override the built-in parsers. (Some people mistakenly thought they were supposed to enter 'doc', 'odt', etc. in the text extensions field.)
- MS Office template files: DocFetcher now supports MS Office template files. The following file extensions were added: dot, xlt, pps, dotx, xltx, ppsx.
- Bug in DocFetcher.bat: The DocFetcher.bat in version 1.0.3 did not work because it contained a Unix line separator.
- Empty search history entry: Under certain circumstances, an empty entry was stored in the search history.
- Indexing dialog tab colors: The colors of the tabs on the indexing dialog weren't updated after OS theme changes.
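Why queries that start with a wildcard tend to be slower can be illustrated with a small sketch. It rests on an assumption not stated on this page: that indexed terms are kept in sorted order, as is typical for full-text search engines. The class WildcardLookup and its methods are hypothetical. A query with a literal prefix only has to test the terms sharing that prefix, whereas a leading ? or * forces a test of every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.regex.Pattern;

/** Hypothetical wildcard matcher over a sorted term dictionary. */
final class WildcardLookup {
    /** Translates a query containing ? and * into an equivalent regular expression. */
    public static Pattern toRegex(String query) {
        StringBuilder regex = new StringBuilder();
        for (char c : query.toCharArray()) {
            if (c == '?') regex.append('.');            // exactly one character
            else if (c == '*') regex.append(".*");      // any number of characters
            else regex.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the literal characters that precede the first wildcard. */
    public static String literalPrefix(String query) {
        int i = 0;
        while (i < query.length() && query.charAt(i) != '?' && query.charAt(i) != '*') i++;
        return query.substring(0, i);
    }

    /** Terms that must be tested: a prefix range if possible, otherwise all terms. */
    public static NavigableSet<String> candidates(NavigableSet<String> terms, String query) {
        String prefix = literalPrefix(query);
        if (prefix.isEmpty()) return terms;             // leading wildcard: full scan
        return terms.subSet(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    /** Returns all terms matching the wildcard query, in sorted order. */
    public static List<String> search(NavigableSet<String> terms, String query) {
        Pattern pattern = toRegex(query);
        List<String> hits = new ArrayList<>();
        for (String term : candidates(terms, query)) {
            if (pattern.matcher(term).matches()) hits.add(term);
        }
        return hits;
    }
}
```

For a dictionary containing apple, goggles, google, microsoft and moogle, the query goo* only needs to test a single candidate, while ?oogle has to test all five terms before returning google and moogle.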