25/8/2002 Must cross-check all virus parsers from stable tree. JKF: They *should* be fine, but please do double-check. Must check all Log::* calls to ensure they don't suffer from % bug. JKF: Already done. I have been careful not to use any variables in Log::* calls and do them all with %s or %d specifiers. Double-check IsSpam code. Try to re-write it if possible. It probably doesn't work. JKF: Has been rewritten. It's now a very simple check using the config file. Re-write all timeout code so it explicitly calls fork(). Should be more reliable. JKF: Not done. Had trouble getting replacement code to work, so have stayed with the current implementation as we know it works okay. Killing of dead children should be better now though. Filename rules will match when filenames are in some foreign character sets. JKF: for "match" read "not match". Now all filename rules have an extra bit automatically added on the end so that they all work in other character set encodings. Postmaster warnings now go to "Notice Recipients". JKF: Indeed. !!!Data Section definitions Config file variables should all be in a __DATA__ section read in, not directly defined in a set of hashes. JKF: Done Properties we need: * Name of variable * string representations of options available * value stored for each option For options with a yes/no response, the string will be "name no 0 yes 1" but it could equally be "name store store delete delete" if that's what is needed. This works for values with few options. Should probably allow "name store store delete delete forward forward" to work. Allow the "storage values" to contain the whole of the rest of the line with the keyword just forming the first word of the line. That way, a line such as "From: match@me.com forward user@address" will store "forward user@address" as a possible value in the set of results. But this is only relevant for "union of all values" keywords. It isn't needed for "0/1" values. This covers a lot, but not everything. Start different sections with /^\[section-name\]/ lines. Need a "no rules allowed" section for keywords which mustn't point at a filename. Call this __[Simple]__. Need a "yes/no" section for keywords matching From: then any To:. Call this section __[YesNo]__. Then need a "union of all matching rules" section for keywords where we need to work out the entire set of possible actions, checking every single rule. Call this __[Union]__. Add more as necessary. Don't know how to store these yet... :-) Thoughts whilst 'getting to know' v4: ------------------------------------ * MailScanner::Config::Override is confusingly named (how about something like "Default"?) JKF: Agreed, I'll change it. * SMDiskStore is used in Message.pm, shouldn't be. JKF: But only in Message::new. What should I have done? :) * Would it be better to return from MessageBatch::new with an empty batch rather than keep looping looking for one, and iterate at the higher level? JKF: 6 of one, half a dozen of the other. Current implementation is told to go and fetch a batch, and doesn't return until it has got one. * How do we allow for the configuration of things like utility paths (cat, sed...) if they are not using autoconf? Maybe a small require'd file that specifies defaults, which is the one in which autoconf can make the changes... JKF: The alternative is to make them real config variables and use MailScanner::Config::Value('rm'). That way they never have to touch anything that even *looks* like code. NWP: I was thinking of ways to avoid making them config variables, to avoid having too many config variables (and too many things for clueless people to break). If you're happy with them as config variables, fine. I do think that they can justifiably be "special", though -- any prepackaged version of mailscanner will be for a particular OS, and "know" where they should be. Any non-prepackaged version will use autoconf to locate them (and allow the locations to be specified to ./configure if desired), so whatever the case, there is no need to have them too easily changeable. * The concepts of DFileName etc. need to be *completely* encapsulated within the mta-specific module(s), to allow for different paradigms in different MTAs (e.g. postfix only has the one file) -- currently they are overexposed/overused. JKF: Feel free to change that, you're more practised in that sort of thing than me. NWP: OK. * ReadQf shouldn't really be passed the whole message object, just references to the bits it's expected to change. JKF: If you like. I don't want it to have a huge list of parameters though. NWP: Yeah. Half of me says fine, do it by passing the object for that reason. The other half says that it will become very confusing trying to remember when you are supposed to be playing with the object's guts directly, and when not. * Each module should use Exporter & define its interface by what it is willing to export. The bits that it exports *must* then be tightly specified. JKF: Do we have to? Sounds awful strict... NWP: It's always possible to cheat, but it will make life *much* easier in the long term if we keep the interfaces between modules clean and clearly specified. It'll also make it easier for us to palm things like AV support and MTA support off to other people. * You shouldn't ever modify a variable within a module's namespace if the module doesn't allow it to be exported. Although there's nothing to stop you if you *really* need to, needing to do so usually just shows you've designed the module wrong. * Some people think you should never touch any of a class/object's attributes directly, but that it should provide functions to manipulate those that need to be manipulated (and those functions will enforce any rules about what is and isn't supposed to be in the attribute). We almost certainly *will* disagree with this (for speed), but is still worth bearing in mind (especially when there are restrictions on what you want in the attribute, like for example whether or not the message headers have trailing newlines). JKF: Yes, and if I liked those people I would have written it all in a language that imposed lots of rules on me. In perl using loads of access functions adds real run-time overhead, I don't want it bogging down in thousands of function calls that replace 1 line of "straight" code. NWP: :) I'll just point out any that seem particularly worthwhile if + when I come across them, then... * I'm not clear about what should really go into $message->{headers} and $message->{qflines}, and why. JKF: qflines is intended as the breeding ground for the new message metadata file. It is only used in Sendmail.pm. As Exim has a metadata file for each message, can you not use it for similar things? I admit its contents will be completely different, but you will still need something along the same lines. Even Postfix will need its metadata working out before the new message file is created. headers is an easy to analyse read-only copy of the message headers, not the envelope metadata. Makes grabbing information from the incoming headers very quick and easy as it's an array, not the single string it was in V3. NWP: OK, I think I get the picture. Basically some/any kind of data structure that contains as much of the metadata as practical? Can we call it "metadata" please? Actually, looking a bit further, it looks like the qflines and headers are put back together again to recreate a deliverable message. This leads me to wonder whether we should really be splitting it out at all. What exactly does get put in an H line in a sendmail qf file? Anything that's not envelope data. TAB is header continuation. *** More questions etc. Why "$quiet" in openlock? Surely we either expect to fail to open the file sometimes, in which case we shouldn't make too much noise about it, or it's unexpected and a problem, in which case we should??