The rules in the file "Scores.hst" determine which articles are loaded immediately, which are logged in the Killfile-Log, and which are ignored completely when new news is pulled.
This filtering is done with a technique called "scoring". In Hamster, this means that each article starts with a score-value of zero and then gains or loses points for each score-rule it matches.
The final score-value then determines whether the article is loaded. If the value is greater than or equal to zero (>=0), it is loaded immediately. Otherwise (<0) it is not loaded but recorded in the Killfile-Log. To keep this logfile small and clear, you can additionally set a score-limit, which prevents log entries for articles with a "very low" score-value.
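The decision described above could be sketched in Python as follows. This is an illustration only, not Hamster's actual code: the function name classify and the parameter log_limit (a stand-in for the configurable score-limit) are hypothetical, and the sketch assumes that articles scoring below the limit are the ones dropped without a log entry.

```python
def classify(score, log_limit=-1000):
    """Return what happens to an article with the given final score.

    log_limit is a hypothetical stand-in for the configurable score-limit;
    articles scoring below it are assumed to be ignored without a log entry.
    """
    if score >= 0:
        return "load"          # loaded immediately
    if score >= log_limit:
        return "killfile-log"  # not loaded, but recorded in the Killfile-Log
    return "ignore"            # "very low" score: not loaded and not logged
```

With the assumed default limit, classify(0) yields "load", classify(-1) yields "killfile-log" and classify(-9999) yields "ignore".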
ScoreFile = *( ScoreBlock / cEOL )
ScoreBlock = ScoreScope *( ScoreRule / cEOL )
ScoreScope = "[" ScopePattern *( 1*WSP ScopePattern ) "]" cEOL
ScopePattern = [ "+" / "-" ] Pattern
ScoreRule = ["?"] ["="] ScoreValue 1*WSP ScoreSelection cEOL
ScoreValue = ( "+" / "-" ) <Number>
ScoreSelection = ScoreDefField 1*( 1*WSP ScorePattern )
ScorePattern = ["+"/"-"] [ "@" ScoreField ":" ] Pattern
ScoreDefField = [ "~" ] ScoreField
ScoreField = ( "Number" / "Subject" / "From" / "Date" /
"Message-ID" / "References" / "Bytes" /
"Lines" / "Xref" / "Xpost" / "Age" ) [":"]
Pattern = ( PatRegExp / PatSimple )
PatRegExp = "{" <PCRE-style regex-pattern> "}"
PatSimple = ( PatSimpleAll / PatSimpleText / PatSimpleNumber )
PatSimpleAll = "*"
PatSimpleText = """ <Text> """
PatSimpleNumber = "%" ( "<" / "=" / ">" ) <Number>
cEOL = [ "#" <Comment> ] CRLF
Each section starts with a "[...]"-header
describing the groupnames for which the following score-lines should be tested:
[*]
# score-lines for all groups
[* -".announce"]
# score-lines for all groups except those containing ".announce"
["news" "usenet"]
# score-lines for all groups containing "news" or "usenet".
[{^news\.} {^alt\.usenet\.}]
# score-lines for all groups starting with "news." or "alt.usenet."
The patterns within "[...]"
follow the same rules as the "Score-Patterns" described below.
The score-value for a tested article is raised with "+"-
and lowered with "-"-values:
+100 subject "hamster"
-100 subject "make money fast"
If a matching score-line is preceded by "=", the
score-value is set to the given value and no further tests are made for this
article:
=+9999 from "my.mail@address"
=-9999 from "spam.mail@address"
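The interplay of "+"/"-" values and "="-lines can be sketched in Python. This is a simplified model, not Hamster's implementation: the function name final_score and the (value, is_final) representation are hypothetical, and the sketch assumes that rules are tested in the order in which they appear in the file.

```python
def final_score(matching_rules):
    """Compute the score from the rules that matched, in file order.

    Each rule is a (value, is_final) pair; is_final marks a rule written
    with a leading "=", which sets the score and stops further testing.
    """
    score = 0
    for value, is_final in matching_rules:
        if is_final:
            return value  # "=" rule: set the score, skip all later rules
        score += value    # "+" or "-" rule: raise or lower the score
    return score
```

For example, an article matching "+100 subject ..." and then "=-9999 from ..." ends up with -9999, regardless of any rules that follow.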
The scoreable fields depend on the overview-information returned by the
newsserver ("XOVER"). In most cases, the fields Subject, From, Date, Age,
Message-ID, References, Bytes, Lines, Xref and Xpost are available for scoring:
+100 subject "hamster"
-100 from {no.*spam}
+500 message-id "my.unique.fqdn"
+100 references "my.unique.fqdn"
-100 bytes %>10000
-100 lines %>250
The fictitious header-field Xpost is based on Xref
and gives the number of groups the article was crossposted to:
-10 xpost %>2 # posted to more than 2 groups
=-9999 xpost %>5 # posted to more than 5 groups
The fictitious header-field Age is based on Date
and gives the age of the article in days:
=-9999 age %>14 # ignore all articles older than 14 days
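How the two derived fields could be computed is sketched below in Python. This is an assumption about the general approach, not Hamster's code: the function names are hypothetical, the Xref value is assumed to have the usual form "server group:number group:number ...", and the age is assumed to be counted in whole days.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def xpost_count(xref):
    """Count the "group:number" entries in an Xref value."""
    # Skip the server name and an optional leading "Xref:" label.
    return sum(1 for tok in xref.split() if ":" in tok and not tok.endswith(":"))

def age_in_days(date_header, now=None):
    """Whole days between the article's Date header and now."""
    posted = parsedate_to_datetime(date_header)
    if posted.tzinfo is None:
        # Dates without usable zone information are treated as UTC here.
        posted = posted.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (now - posted).days
```

An Xref value such as "news.example.com alt.test:123 news.misc:456" (an invented example) gives an Xpost of 2.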
If a fieldname is preceded with "~", the value
of the field is MIME-decoded before testing. If given, this decoding is also
done for any additional "@"-fields in this line:
+100 ~subject "hämstêr"
-100 ~from "jürgen" +@subject:"hämstêr"
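Why the "~" matters can be illustrated in Python with the standard library's RFC 2047 decoder. This only demonstrates the effect of MIME-decoding on a header value; the helper name mime_decode is hypothetical and nothing here is taken from Hamster's own code.

```python
from email.header import decode_header, make_header

def mime_decode(raw):
    """Decode RFC 2047 encoded-words such as =?ISO-8859-1?Q?...?= to text."""
    return str(make_header(decode_header(raw)))

# A subject as it may appear in the overview data, still MIME-encoded:
raw_subject = "=?ISO-8859-1?Q?h=E4mst=EAr?="
decoded_subject = mime_decode(raw_subject)
```

Without decoding, a test for "hämstêr" against raw_subject fails, because the raw value only contains the encoded form; against decoded_subject it succeeds.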
Before an article is loaded, only a reduced set of header lines can be tested (see previous section). After the article has been loaded, additional rules can be applied without this restriction: all headers are now available, so all of them (and even the body of the message) can be tested.
Rules that should be tested after the article has been loaded must be marked with a leading question mark:
?+42 User-Agent "hamster"
?=-9999 Path "!known.spamm.er!not-for-mail"
Of course, such rules can no longer decide whether the article should be loaded, as it has already been loaded. Their purpose is to decide what to do with the loaded article.
Again, the score value starts at 0 and can be raised or lowered by such
"?" rules. If the final score value is 0 or greater, the article is stored
in the database. If it is below 0, the article is not stored but simply
dropped. So the main purpose of these special "?" rules is to get rid of
really unwanted articles that could not be detected by the normal rules
described in the previous section.
Besides the names of any header lines, you can also use some special names in
"?" rules:
?+42 Header "hamster"
?+42 Body "hamster"
?+42 Article "hamster"
Header checks whether any header line matches the given
patterns, Body checks all body lines, and Article
checks all header and all body lines.
Patterns without a leading "+"- or "-"-sign
mean that at least one of them must match:
# "hamster" or "newsserver" or "mailserver"
+1 subject "hamster" "newsserver" "mailserver"
Patterns with a leading "+"-sign mean that the
field has to contain this value:
# "hamster" in combination with "newsserver" or "mailserver"
+1 subject +"hamster" "newsserver" "mailserver"
Patterns with a leading "-"-sign mean that the
field must not contain this value:
# "newsserver" or "mailserver" not regarding "unix/linux/inn"
+1 subject "newsserver" "mailserver" -"unix" -"linux" -"inn"
# From-headers not containing "@"
=-9999 from -"@"
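The three kinds of patterns combine as sketched in the following Python function. It is a simplified model under stated assumptions, not Hamster's matcher: the name selection_matches and the (sign, text) representation are hypothetical, only simple text patterns are handled, and they are assumed to be case-insensitive substring tests.

```python
def selection_matches(value, patterns):
    """Test a field value against a list of (sign, text) patterns.

    sign is "" (at least one such pattern must match), "+" (must match)
    or "-" (must not match).
    """
    haystack = value.lower()

    def hit(text):
        return text.lower() in haystack

    optional = [t for s, t in patterns if s == ""]
    required = [t for s, t in patterns if s == "+"]
    forbidden = [t for s, t in patterns if s == "-"]

    if any(hit(t) for t in forbidden):
        return False
    if not all(hit(t) for t in required):
        return False
    # With no unsigned patterns at all, the signed ones alone decide.
    return not optional or any(hit(t) for t in optional)
```

Under these assumptions, the subject "Hamster newsserver setup" matches the patterns +"hamster" "newsserver" "mailserver", while "Hamster on Linux" does not, because neither of the unsigned patterns occurs in it.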
To combine different header-fields in one score-line, you can qualify a pattern with the name of the field it should be tested against, using "@":
-1 subject "help" "urgent" "!!!" -@from:"my@address" -"SCNR"
If a score-pattern is placed within "{...}", it
is treated as a PCRE-style regular expression[*]:
# Ignore those who want to be ignored:
-1 from {no.?spam} {(remove|delete|cut).*this}
[*] Perl-documentation for regular expressions can be found at:
http://www.perl.com/CPAN-local/doc/manual/html/pod/perlre.html
# A section starting with "[*]" contains global score-entries, which will be
# used for all groups:
[*]

# Load "my" articles immediately:
=+9999 From "Your Name"
=+9999 Message-ID your.unique.fqdn

# Load articles referencing one of "my" articles:
=+5000 References your.unique.fqdn

# Certainly, we are very interested in articles regarding these funny little
# animals with small antennas on the head:
=+1000 Subject hamster "HELP! THERE'S A BIG FAT RAT ON MY SCREEN!" "SCNR ;-)"

# And certainly, we ignore really silly suggestions such as:
=-1000 Subject "MAKE HAMSTER FAST!!!!"
# (please notice, that this entry would never match, as subjects containing
# "hamster" would match the "="-entry above)

# The examples below use group-specific score-entries by starting a new
# section in the scorefile with a "[...]"-line.
# As Hamster builds an "individual" scorelist for each group before loading
# articles for it, it is more effective to define "individual" filters, if
# score-entries are only needed for some of the groups.

# Filter out "big" articles, that do not have "FAQ" in subject and are not
# posted in an announce-group:
[* -announce]
-10 Lines %>200
-10 Bytes %>10000
+20 Subject FAQ

# Ignore articles posted to more than three groups:
-10 Xpost %>3

# Ignore articles with subjects containing "!!!" in all groups except the
# newusers-groups:
[* -newusers -neubenutzer]
-1 Subject "!!!"

# Some groups may be more readable, if you filter out all articles and
# only load specific ones immediately, e.g.:
[group.name.one group.name.two group.name.three]
-1 Message-ID *
+1 Subject "interest1" "interest2" "interest3" "interest4"
+1 From "user1" "user2" "user3" "user4"