
Surprise, FastFind NX is coming (very) soon !

Chris and I came up with an idea for increasing the crawling speed of FastFind, and I started playing around with it at home. While doing that I thought of several other features and how I could implement them in FastFind, so I started coding them up and realised they were significant enough to be worth a new release. I think they will make a big difference to FastFind; hopefully you will too.

Of the features listed below, the only one left to do is the PDF parser, which still needs finishing off. After that there will be lots of testing and polish, but I am hoping to have at least a beta out in about a month, so it's actually pretty close to being done.

  • New control panel interface bringing it in line with our other NX products
  • Improved crawler that only crawls the site once. The previous crawler crawled a site twice, once for links and a second time for content, which allowed it to display a progress bar. FastFind NX crawls only once and uses the page count from the previous crawl as a guide for the progress bar. As a result, the progress bar may not seem to behave correctly after you have added or removed a large number of pages, but crawls should be roughly twice as fast as in the previous version.
  • Improved crawler that can greatly reduce both the amount of content transferred and the time a crawl takes. If you have the php-curl module installed, FastFind can now detect whether a page has changed since the last crawl (assuming your server returns the correct headers) and, if it hasn't, skip transferring the file again. This will be especially beneficial for sites with large amounts of static content.
  • Modular Parser System

    A FastFind NX parser extracts the content you want to be searchable from a file. The new parser system lets you create a simple PHP object that extends a base Parser class; you just have to implement the Parse() function. You can then assign it, using the FastFind NX web interface, to any URL/file extension you like. FastFind NX will ship with the following parsers:
    • Default html parser
    • New PDF parser 

    Other potential parsers could include
    • A parser to extract the EXIF information from photos (I will probably create this one as a way to document the process of writing a parser)
    • A parser to extract the ID3 information from MP3s
    • A parser to extract the text from Word documents
    • A parser that passes the content to an external application to extract the text for you
    • A parser to get a list of filenames from a zip file
    • And many more
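
To make the parser idea concrete, here is a minimal sketch of the pattern in Python. It is only an illustration: real FastFind NX parsers are PHP objects, and every name below (`Parser`, `HtmlParser`, `PlainTextParser`, `PARSERS`, `parser_for`) is a placeholder I made up rather than the actual FastFind NX API.

```python
import os
from abc import ABC, abstractmethod
from html.parser import HTMLParser
from urllib.parse import urlparse


class Parser(ABC):
    """Hypothetical base class: turn the raw bytes of a file into searchable text."""

    @abstractmethod
    def parse(self, raw: bytes) -> str:
        ...


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


class HtmlParser(Parser):
    """Very naive HTML parser: keeps all text, including script/style bodies."""

    def parse(self, raw: bytes) -> str:
        collector = _TextCollector()
        collector.feed(raw.decode("utf-8", errors="replace"))
        collector.close()
        return " ".join(collector.chunks)


class PlainTextParser(Parser):
    """Collapses runs of whitespace in a plain-text file."""

    def parse(self, raw: bytes) -> str:
        return " ".join(raw.decode("utf-8", errors="replace").split())


# Assumed registry: which parser handles which file extension.
PARSERS = {".html": HtmlParser(), ".htm": HtmlParser(), ".txt": PlainTextParser()}


def parser_for(url: str, default: Parser = HtmlParser()) -> Parser:
    """Pick a parser from the extension of the URL path, falling back to HTML."""
    extension = os.path.splitext(urlparse(url).path)[1].lower()
    return PARSERS.get(extension, default)
```

Adding support for a new file type then comes down to writing one more subclass with a `parse()` method and registering it against an extension.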

  • Modular Reporting System
    The new reporting system lets you write your own reports by creating a file in the reports directory with two functions: Intro(), which defines a welcome page that explains what the report will do and takes any input required from the user, and Report(), which generates the actual report. FastFind NX will ship with at least the following reports:
    • Broken Links Report
    • Page Rank Report

    Other potential reports could include
    • Keyword density
    • Check for spelling errors on pages
    • Meta Keyword phrase generator
    • Sitemap generators
    • And many more
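
As a rough sketch of the two-function shape of a report, here is a broken-links report written in Python. This is not FastFind NX code: the real reports are PHP files with Intro() and Report(), and the `pages` structure (URL mapped to an HTTP status and a list of internal links) is an assumption made purely for the example.

```python
def intro():
    """Describe the report and list any input it needs from the user (none here)."""
    return {
        "title": "Broken Links Report",
        "description": "Lists internal links that point to missing or failing pages.",
        "inputs": [],
    }


def report(pages):
    """Return (source, target) pairs for every broken internal link.

    `pages` is an assumed crawl result: {url: {"status": int, "links": [url, ...]}}.
    A link counts as broken if its target was never crawled or returned a
    status of 400 or above.
    """
    broken = []
    for source, info in sorted(pages.items()):
        for target in info["links"]:
            target_info = pages.get(target)
            if target_info is None or target_info["status"] >= 400:
                broken.append((source, target))
    return broken
```

For a crawl where `/` links to a page that returned 404, `report()` would list that link with `/` as its source.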
  • Search Weighting system

    Control the ordering of your search results by applying weights (importance) to various parts of each page. Using this you can greatly improve the relevance of your search result ordering. The metrics that you can adjust for this are
    • Keyword Score
    • If the keyword is the page name
    • If the keyword is in the page name
    • If the keyword is in the url
    • If the keyword is in the title
    • The number of internal backlinks to the page
    • The length of the url
    • The depth of the url from the initial crawl page
    • The size of the page content
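
One simple way to picture the weighting is as a weighted sum over these metrics. The Python sketch below assumes exactly that; how FastFind NX actually combines the weights is not described here, and the weight values and metric names are invented for illustration.

```python
# Assumed weights, one per metric from the list above. Negative weights
# penalise long URLs and pages that sit deep below the starting page.
WEIGHTS = {
    "keyword_score": 1.0,
    "is_page_name": 5.0,
    "in_page_name": 3.0,
    "in_url": 2.0,
    "in_title": 4.0,
    "backlinks": 0.5,
    "url_length": -0.25,
    "depth": -1.0,
    "content_size_kb": 0.0,
}


def score(metrics, weights=WEIGHTS):
    """Weighted sum of a page's metrics; metrics that are missing count as 0."""
    return sum(weight * metrics.get(name, 0) for name, weight in weights.items())


def rank(results, weights=WEIGHTS):
    """Order search results from highest to lowest score."""
    return sorted(results, key=lambda r: score(r["metrics"], weights), reverse=True)


results = [
    {"url": "/archive/2007/old-news.html",
     "metrics": {"keyword_score": 3, "in_url": 1, "url_length": 40, "depth": 3}},
    {"url": "/news.html",
     "metrics": {"keyword_score": 2, "in_title": 1, "backlinks": 4,
                 "url_length": 20, "depth": 1}},
]
ordered = [r["url"] for r in rank(results)]
```

With these made-up weights the short, well-linked `/news.html` ranks above the deeper archive page even though its keyword score is lower, which is the kind of adjustment the weights are meant to allow.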
  • Ability to honour a robots.txt file, which should make getting started with FastFind on an existing site even easier
  • Ability to clear the search stats from the settings page
  • Ability to show a "new" icon next to pages that have been added recently (how long a page is considered recent is configured on the settings page). If all pages are recent, it won't show the icon.
  • Ability to restrict a crawl so that it does not go higher than the starting path specified on the settings page, allowing FastFind to properly index only a section of your site.
  • Ability for us to ship with different languages (only includes English at present though)
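
For anyone curious how a crawler can skip unchanged pages, the usual HTTP mechanism is a conditional request: the crawler sends back the ETag or Last-Modified value it saw last time, and the server answers 304 Not Modified with no body if nothing has changed. The sketch below shows that mechanism in Python using only the standard library. It assumes these are the headers involved, which the list above does not spell out, and it is not the php-curl code FastFind uses; the function names and the `cached` dictionary are hypothetical.

```python
import urllib.error
import urllib.request


def conditional_headers(cached):
    """Build request headers from the validators saved during the last crawl."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def fetch_if_changed(url, cached, timeout=10):
    """Return (body, validators); body is None when the server answers 304."""
    request = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            validators = {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
            return response.read(), validators
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError; treat it as
        # "unchanged" and keep the validators we already had.
        if err.code == 304:
            return None, cached
        raise
```

If the server sends neither header, there is nothing to compare against and the page is simply downloaded in full on every crawl.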
If you have any other ideas for features that I might be able to slip into the release, just reply to this blog post.

8 Responses to "Surprise, FastFind NX is coming (very) soon !"


 
Chris Boulton
said this on 05 Aug 2007 12:26:33 AM CDT
Very much looking forward to these changes - and glad to see you've got the PDF support implemented as it means I no longer have to implement it myself!

Chris

 
Johnny
said this on 07 Aug 2007 8:46:16 AM CDT
I've still got a copy of FastFind from when it was available for free. It's a nice little script but I found it somewhat limited in application. I'm sure the NX version will be greatly improved and I'm looking forward to it.

Would you ever consider releasing a version of FastFind that allows searching of multiple websites...aka a Google like search engine.

[Rodney's Reply: I thought about adding that in for this version but decided that I'd rather get the ability of FastFind for one site up before tackling the multiple website problem ... but it is definitely something I have in the back of my head to try and include into FastFind.]

 
Cindy
said this on 12 Aug 2007 11:34:34 PM CDT
You all seem like such brains that my request sounds very silly to me . . . but any chance the search results page can be more friendly as far as customizing it? I would like to have the search results page be a specific width (haven't been able to achieve this yet) and maybe even have the results appear differently. And maybe even have the results land in a template page that looks just like the rest of my Website . . .

Hmmm . . . women always want to change a good thing! 8)

[Rodney's Reply: That sounds like something that you should be able to achieve with the current version. If you send through a support ticket from the client area I can help you with at least some of this.]

 
jb5ep
said this on 03 Aug 2007 1:19:26 PM CDT
Looks great, Rodney. One small request - any chance the default themes/templates could be XHTML1 strict? Just makes life a bit easier when trying to customise stuff....

Cheers.
[Rodney's Reply: I'll see what I can do. If I can't get it exactly compliant I will at least get it much closer than the current one to make things a little easier]

 
Brian
said this on 09 Aug 2007 4:59:30 PM CDT
I would like to second Johnny's idea.
This is not a product that most of us would actually use to create a Google, but some of us have several related sites, and the multiple site capability would be very very helpful. I am afraid that if you add the multiple site capability, you might give a Google-like price tag. Maybe you could limit it to 100 sites, and charge a higher price for those wanting to take on Google ... LOL.

 
david
said this on 03 Aug 2007 10:25:32 AM CDT
If you do a sitemap generator, make sure it can create a Google (XML) or Yahoo sitemap.

[Rodney's Reply: The sitemap report probably won't be with the initial release but the way things have changed will make it easier to implement]

 
David
said this on 03 Aug 2007 10:26:31 AM CDT
For sites using AdSense, you may want to add an optional feature so that Google search can be turned on, letting visitors choose between searching the site and searching the web.

 
Godfather
said this on 03 Aug 2007 3:31:42 PM CDT
Rodney, awesome news--thanks for sharing! I am quite excited to see this version released, as it looks like it is going to be a leap forward.


