
Re: [curn-users] issue with IgnoreDuplicateTitles plug-in



On 8/5/07 9:20 PM, Bharath Prathipati wrote:
> Thanks Brian.. that was a prompt reply!
> Here's my config file:
> [var]
> # "feedDir" dumps to a directory that's accessible internally via URL
> feedDir: .
> # curnDir: where this file and the cache live
> curnDir: .
>
>
>
> [curn]
> CacheFile: ${var:curnDir}/common.cache
> MaxThreads: 15
> ParserClass: org.clapper.curn.parser.rome.RSSParserAdapter
> GzipDownload: true
>
>
> [Feed_slashdot]
> # Slashdot
> URL: http://rss.slashdot.org/Slashdot/slashdot
> SaveAs: ${var:feedDir}/slashdot.xml
> IgnoreDuplicateTitles: true
>
> And the command line:
> (to enable logging..)
>
> set CURN_JAVA_VM_ARGS=-Djava.util.logging.config.file=./logging.properties

Okay, well there are several problems here:

1. You're passing an argument for the JDK logging facility, but your
   logging.properties file is for the Log4J logging facility.
2. In any case, you're missing the argument to the Apache Commons Logging
   layer that tells it what underlying logging layer to use.
3. You have to use a URL, not a path, for the Log4J configuration file.

Since I don't know where "." is in your example, I'll make something up and
pretend that it's "C:\bharath\curn". Here's what you really want:

set CURN_JAVA_VM_ARGS=-Dorg.apache.commons.logging.log=org.apache.commons.logging.impl.Log4JLogger -Dlog4j.configuration=file:/C:/bharath/curn/logging.properties
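
If you are unsure how a given path should look as a URL, a small Java
program can print it for you. This is only a sketch: the class name
Log4jConfigUrl and the helper toFileUrl are made up for illustration and
are not part of curn or Log4J. It uses nothing but the standard library.

```java
import java.nio.file.Paths;

public class Log4jConfigUrl {
    // Hypothetical helper: turn a relative or absolute filesystem path
    // into an absolute file: URL string.
    static String toFileUrl(String path) {
        return Paths.get(path).toAbsolutePath().normalize().toUri().toString();
    }

    public static void main(String[] args) {
        // Default to logging.properties in the current directory.
        String path = args.length > 0 ? args[0] : "logging.properties";
        System.out.println("-Dlog4j.configuration=" + toFileUrl(path));
    }
}
```

Run it from the directory that holds logging.properties and paste the
printed argument into CURN_JAVA_VM_ARGS.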

> (to invoke curn)
>
> curn --logging -C curn.cfg
> and this is the "logging.properties" (in case you want to have a look at it)
> log4j.rootLogger=debug, File
> log4j.appender.File=org.apache.log4j.FileAppender
> log4j.appender.File.layout=org.apache.log4j.PatternLayout
> log4j.appender.File.file=./log.out
> # Overwrite the file each time
> log4j.appender.File.append=false
> # Print the date in ISO 8601 format
> log4j.appender.File.layout.ConversionPattern=%d %-5p (%c{1}): %m%n
> log4j.logger.org.clapper.curn=debug
>
> The issue here is not to enable logging .. though :) .. but to get the
> duplicate plugin working.

Except that the logs can help me...

> And also I have a question about 'IgnoreDuplicateArticlesPlugIn.java'. I
> suppose this is the underlying class that's called when
> IgnoreDuplicateTitles is set to true. I somehow forced articles with
> duplicate titles to see how they're handled. They have exactly the same
> title. But curn couldn’t suppress it. I was hoping to modify that plugin
> to make it more sophisticated, if this works well.

Yes, that's the plug-in that implements IgnoreDuplicateTitles.

I pulled down the Slashdot feed that you configured. There are no duplicate
titles in that feed, I'm afraid.

In any case, I hacked the downloaded file to add a duplicate entry. When I
run curn without the "IgnoreDuplicateTitles: true" configuration
parameter (in the Feed section), I get two copies of that article. When I
add:

        IgnoreDuplicateTitles: true

to the Feed section, I only get one copy of the article.
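
For reference, the behavior described above (keep the first article with
a given title, drop later ones whose title matches exactly) can be
sketched in a few lines of Java. This is not the plug-in's actual source;
the class DuplicateTitleSketch and the method dropDuplicateTitles are
hypothetical names, and exact string matching is an assumption made for
the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateTitleSketch {
    // Hypothetical helper: keep the first occurrence of each title,
    // preserving the original order. Titles are compared exactly.
    static List<String> dropDuplicateTitles(List<String> titles) {
        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String title : titles) {
            // Set.add returns false if the title was already seen.
            if (seen.add(title)) {
                kept.add(title);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("Story A", "Story B", "Story A");
        System.out.println(dropDuplicateTitles(titles));
    }
}
```

A more sophisticated version could normalize the title (case, whitespace)
before adding it to the set; that would be a change in behavior, not
something this sketch does.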

I need some more clarification of your problem.
-- 
-Brian

Brian Clapper, http://www.clapper.org/bmc/
Whenever anyone says, "theoretically", they really mean, "not really".
	-- Dave Parnas
---
*** Posted to the curn-users mailing list (curn-users@xxxxxxxxxxx).


