Idea for CharInSet

By Andreas Hausladen | January 3, 2009

As a Delphi 2009 user you may have already found out that you are supposed to replace “Ch in ['A'..'Z']” constructs with the new inlined CharInSet function. Unfortunately, Delphi's inline functions aren't handled like C++ inline functions. The Delphi compiler doesn't substitute the constant/literal values for the parameters but treats them as real parameters. This makes the “in” operator inside CharInSet less efficient because the compiler no longer sees that you have a constant set. It only sees a set parameter, so it uses the slower memory-accessing assembler instructions instead of the faster arithmetic instructions.
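To make the two forms concrete, here is a minimal sketch of the same test written both ways. The helper names IsUpperWithIn and IsUpperWithCharInSet are made up for illustration and do not come from the RTL; only CharInSet itself is the SysUtils function discussed above. In Delphi 2009 the first variant should compile with the “WideChar reduced to byte char” warning (W1050), while the second compiles cleanly.

```delphi
program CharInSetDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: the pre-2009 idiom. Char is WideChar in Delphi 2009,
// so the compiler is expected to warn about this set expression.
function IsUpperWithIn(Ch: Char): Boolean;
begin
  Result := Ch in ['A'..'Z'];
end;

// Hypothetical helper: the same test routed through SysUtils.CharInSet.
// The set literal is passed as a parameter of the inlined function.
function IsUpperWithCharInSet(Ch: Char): Boolean;
begin
  Result := CharInSet(Ch, ['A'..'Z']);
end;

begin
  Writeln(IsUpperWithIn('Q'));
  Writeln(IsUpperWithCharInSet('Q'));
  Writeln(IsUpperWithCharInSet('q'));
end.
```

Both helpers return the same results; the difference described in this post is only in the code the compiler generates for the set test.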

This is no problem in normal code, but if you have tight loops that, e.g., iterate over a string, the loop slows down a lot. That’s why the “Ch in ['A'..'Z']” test in the SysUtils.LowerCase(string) function uses the case-of statement instead of the in-operator.
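As an illustration of the case-of alternative in a tight loop, the sketch below counts the uppercase ASCII letters in a string. CountUpperCase is a made-up helper for this example and is not the actual SysUtils.LowerCase source; it only shows the shape of a loop that tests each character with a case range instead of a set.

```delphi
program TightLoopDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper: walks the string once and tests every character
// with a case range, so no set parameter is involved at all.
function CountUpperCase(const S: string): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    case S[I] of
      'A'..'Z': Inc(Result);
    end;
end;

begin
  Writeln(CountUpperCase('Hello World, DELPHI')); // prints 8
end.
```

Because the range is part of the case statement itself, the compiler sees the constant bounds directly, and the Char selector does not trigger the WideChar set warning.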

But now to the idea that I just had. With my DLangExtensions plugin it would be possible to override the CharInSet function with a macro that makes CharInSet as fast as the in-operator was before. This way you would still get the “WideChar to ByteChar” warnings if you don’t use CharInSet, but you would also get the speed of the in-operator with constant sets if you do use CharInSet. And if somebody doesn’t have the DLangExtensions plugin, the code would still compile but produce less optimized code.

6 thoughts on “Idea for CharInSet”

  1. Thomas Mueller

    I’d prefer to continue using the
    c in […]
    syntax. Wouldn’t it be possible to create an extension that converts this construct on the fly? The added benefit is that no conversion is necessary.

  2. Loïs Bégué

    To be “in” or not to be “in”…

    1) the main purpose of a test like “c in […]” is/was to prevent “wrong” characters from being entered by a user.
    “wrong” could be equal to:
    – anything but one of the 26 letters of the English alphabet.
    – special chars like the ones with a code below 32.
    – … and so on …
    If our applications (and our databases) are to become “unicode” aware, they should allow “any char” (!!!) to be used as an entry.
    Only a few tests should remain necessary for the purpose of backward compatibility with older systems.

    2) A test like “c in […]” will mostly be run just after a user input. It happens one character at a time and does not take much time.

    => IMO, a call to “CharInSet(…)” shouldn’t be a drawback in most of the situations.

  3. Andreas Hausladen Post author

    > the main purpose of a test like “c in […]” is/was to
    > avoid “wrong” characters to be entered by a user.

    Maybe in your case. But because I develop for the German market, I had to check for additional chars as well. And I always used the CharIsAlpha/CharIsAlphaNum/… functions from the JCL for this.
    As a component, library/framework and IDE plugin developer I use the in-operator for code parsing and string handling. And that is where the slow CharInSet is at its worst.

  4. Loïs Bégué

    Hi Andreas,
    I’ve got your point, though. I develop for the German market too (albeit I’m French) 🙂

    Hmmm … Already tried the following?
    1) declare an array[0..255] of Boolean
    2) initialize it once at program start with “True” for all allowed characters.
    3) check the input character like this
    If AllowedCharsArray[Ord(c)] then
    everything’s fine
    else
    something went wrong;
    Pfiadi from Munich 😉
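The lookup-table approach from the three steps above could be sketched roughly as follows. The names AllowedChars, InitAllowedChars and IsAllowed are made up for this example, and the choice of the ASCII letters as the allowed characters is only an assumption for demonstration. Since Char is a WideChar in Delphi 2009, the sketch also guards against character codes beyond the end of the table, which the steps above do not mention.

```delphi
program LookupTableDemo;

{$APPTYPE CONSOLE}

var
  // One flag per single-byte character code (0..255).
  AllowedChars: array[0..255] of Boolean;

// Hypothetical initializer: called once at program start.
// Marks the ASCII letters as allowed (an assumption for this example).
procedure InitAllowedChars;
var
  C: Integer;
begin
  FillChar(AllowedChars, SizeOf(AllowedChars), 0); // every entry False
  for C := Ord('A') to Ord('Z') do
    AllowedChars[C] := True;
  for C := Ord('a') to Ord('z') do
    AllowedChars[C] := True;
end;

// Hypothetical check: characters with a code above 255 are rejected
// before the table is indexed, so the array access stays in range.
function IsAllowed(Ch: Char): Boolean;
begin
  Result := (Ord(Ch) <= High(AllowedChars)) and AllowedChars[Ord(Ch)];
end;

begin
  InitAllowedChars;
  Writeln(IsAllowed('k'));
  Writeln(IsAllowed('5'));
end.
```

The check itself is a single array read, at the cost of a 256-entry table and a one-time initialization.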

  5. Andreas Hausladen Post author

    That’s exactly what JclStrings.CharIsAlpha does if compiled with Delphi 5-2007. In Delphi 2009 it uses the Unicode character information.

  6. Loïs Bégué

    Didn’t think about checking the JCL code.
    I’ve got the same idea as the JCL authors?? Really ???
    I should join the JEDI team, maybe… :))
