Re: [curn-users] issue with IgnoreDuplicateTitles plug-in
On 8/5/07 8:34 PM, Bharath Prathipati wrote:
> hi there!
>
> I was trying out the windows version of curn and was trying this
> IgnoreDuplicateTitles plugin, but could never get that to work. And even
> my attempt to enable logging was not successful.
>
> For the plugin.. I used “IgnoreDuplicateTitles: true” for each feed. Is
> there something else to be done?
>
> Maybe my question/request is too abstract! Please let me know if
> anything else needs to be provided.
Bharath,
Logging does work, but getting it configured properly can be a challenge
the first time you try it. (That has a lot to do with how the underlying
logging APIs work.)
Please send me:
a) Your curn configuration file.
b) The command line you used to invoke curn.
Note that the IgnoreDuplicateTitles plug-in is rather simplistic. It
simply compares the article titles within each feed to see if there are
duplicates. It attempts to normalize the titles before comparing them,
but only slightly:
- It converts all adjacent white space into a single space.
- It converts the title to lower case.
Thus, these two titles will compare as equal (and the second one will be
suppressed):
    Dog drags owner from well
    Dog  drags   Owner from well
The first one will be converted to "dog drags owner from well" and saved.
When curn sees the second title, it will remove the extra spaces and convert
it to lower case; the result will match the first title, and the second
article will be suppressed.
It doesn't do anything fancier than that, though.
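As a rough illustration only, the comparison described above amounts to
something like the following Python sketch. This is not curn's actual
code (curn is written in Java), and the function names normalize_title
and suppress_duplicate_titles are made up for the example:

```python
import re

def normalize_title(title):
    # Hypothetical helper: collapse each run of adjacent white space
    # into a single space, then convert the title to lower case.
    return re.sub(r"\s+", " ", title).lower()

def suppress_duplicate_titles(titles):
    # Keep the first article for each normalized title within one feed;
    # later articles whose normalized title was already seen are dropped.
    seen = set()
    kept = []
    for title in titles:
        key = normalize_title(title)
        if key in seen:
            continue
        seen.add(key)
        kept.append(title)
    return kept

feed = ["Dog drags owner from well", "Dog  drags   Owner from well"]
print(suppress_duplicate_titles(feed))
```

Titles that differ in any other way (punctuation, an extra word, and so
on) produce different keys and are both kept.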
Send me your config file. I'll take a look.
--
-Brian
Brian Clapper, http://www.clapper.org/bmc/
A day without sunshine is like night.
---
*** Posted to the curn-users mailing list (curn-users@xxxxxxxxxxx).