The new IDE Fix Pack 6.3 adds some IDE fixes and makes DFM reading a little bit faster by using buffers on the stack instead of the TBytes heap allocations with unnecessary FillChar calls that were introduced with XE3.
The compiler option extension -x-fpr also got a new feature. If a function has stack variables that exceed 4096 bytes, the function prolog that is generated will be up to 3 times faster than the original compiler generated code (RSP-19826).
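To illustrate which functions this applies to, here is a minimal sketch of a function whose local variables exceed 4096 bytes. The program and function names are made up for the example, and the remark about page probing is based only on the changelog entry for RSP-19826, not on the actual generated code.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only so the sketch also compiles with Free Pascal
program StackProbeDemo;

{$APPTYPE CONSOLE}

// Hypothetical example function: the 8 KB local array makes the stack
// variables exceed 4096 bytes, so the compiler has to emit a prolog that
// probes the stack memory pages before the locals are used.
function SumBuffer: Integer;
var
  Buffer: array[0..8191] of Byte;
  I: Integer;
begin
  for I := Low(Buffer) to High(Buffer) do
    Buffer[I] := I and $FF;
  Result := 0;
  for I := Low(Buffer) to High(Buffer) do
    Inc(Result, Buffer[I]);
end;

begin
  Writeln(SumBuffer);
end.
```

A function with only a few small locals stays below the limit and should not be affected by this part of -x-fpr.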
Furthermore, the compiler option extensions are now combined into -x-On options, which makes it easier to specify them because you don’t need to remember all the option names anymore.
-x-O1 Enable options -x-fvs -x-fpr
-x-O2 Enable options -x-fvs -x-fpr -x-orc
-x-O3 Enable options -x-fvs=2 -x-fpr -x-orc
-x-Ox Enable ABI changing optimizations: -x-fvs=2 -x-fpr -x-orc=2 -x-ff
All options are listed in the Readme.txt and can be specified under Project Options/Delphi Compiler/Compiling/Other options/Additional options to pass to the compiler.
- Added: Patch to remove IDE flickering when WM_SETTINGCHANGE is broadcast
- Added: Fix for RSP-20700: Tooltip Help Insight is blinking if Structure View is scrolled
- Added: Undo XE3+ TFiler/TReader/TParser/TStream TBytes usage, replace SetLength with SetLengthUninit for special cases
- Added: -x-fpr generates 3 times faster stack memory page probing code (RSP-19826)
- Added: Options -x-O1, -x-O2, -x-O3, -x-Ox that enable other optimization options
Download (fastdcc for the command line compiler):
Delphi 10.2 Tokyo changed how the files for units that are explicitly specified in the project file (dpr/dpk) are found. The IDE Fix Pack’s directory file search cache still made assumptions that were correct in Delphi 2009-10.1 Berlin but aren’t in Tokyo. This could result in a “program or unit xxx recursively uses itself” error message if you had a relative path in a filename that is specified in the project file.
This release also adds another option to the compiler codegen optimizations. The new -x-orc / -x-orc=n option allows you to eliminate temporary record copies for functions like “begin Result := FRedirect.GetRecord; end;”. This optimization allows the compiler to skip the try/finally block with InitializeRecord/CopyRecord/FinalizeRecord calls for the temporary record variable that is then copied to the actual result record.
- Fixed: Directory search cache failed if project unit paths contained “..\” (Delphi 10.2 only)
- Fixed: Some VirtualProtect calls specified nil as the last argument, which Windows 10 1709 doesn’t like if a debugger is attached.
- Added: Option -x-orc and -x-orc=n to remove temporary record variables for function results (n=1: only if the assignment is the last statement, n=2 for all)
A small update for IDE Fix Pack that adds support for the just released Delphi 10.2.2.
- Added: Support for Delphi 10.2 Update 2
- Added: Fix for unnecessary temporary variable if an empty open array argument is part of a function call (Delphi 2009-10.1)
While debugging the String4D code to hunt down a bug in the CompilerSpeedPack, I saw a lot of CopyRecord/FinalizeRecord calls with a try/finally that the compiler generated.
If you have a record with managed fields (string, interface, …) and use it as a function return type, the Delphi compiler will change the function into a procedure with an extra var-parameter. So you would think that the compiler treats the result-parameter like a normal var-parameter, but that isn’t the case. The compiler will generate code that guarantees that the result record isn’t changed if the called function throws an exception. For this it adds a temporary record that is used for the function result and then copies it to the target record.
type
  TMyRec = record Value: string; end;

function InternalGetRec: TMyRec;
begin Result.Value := 'Hello'; end;

function GetRec: TMyRec;
begin Result := InternalGetRec; end;

procedure Test;
var R: TMyRec;
begin R := GetRec; end;
The compiler rewrites the “Test” function to:
procedure Test;
var R, TempR: TMyRec;
begin
  try
    GetRec(TempR);
    CopyRecord(TempR, R, TypeInfo(TMyRec));
  finally
    FinalizeArray([TempR, R], TypeInfo(TMyRec));
  end;
end;
The same happens if you assign another function’s record result value to your own record result value. The compiler rewrites the “GetRec” function’s code to:
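function GetRec: TMyRec;
var TempResult: TMyRec;
begin
  try
    InternalGetRec(TempResult);
    CopyRecord(TempResult, Result, TypeInfo(TMyRec));
  finally
    FinalizeRecord(TempResult, TypeInfo(TMyRec));
  end;
end;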
Because the compiler assumes that you may want to use “Result” in the function after the call, it has to guarantee that it is unchanged if an exception is thrown. But if it is the last statement in the function and not secured by an explicit try/finally/except where “Result” is used again, an optimization could be to omit the temporary record, making the code a lot faster.
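The difference between the two situations can be sketched with a small console program. GetRecLast and GetRecNotLast are hypothetical names that I use only for this illustration. The comments describe the behavior as explained above, they are not taken from the compiler's actual output.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only so the sketch also compiles with Free Pascal
program OrcDemo;

{$APPTYPE CONSOLE}

type
  TMyRec = record
    Value: string; // managed field, so copying the record is not a plain memory move
  end;

function InternalGetRec: TMyRec;
begin
  Result.Value := 'Hello';
end;

// The assignment is the last statement and Result isn't used afterwards.
// This is the shape for which the temporary record could be omitted.
function GetRecLast: TMyRec;
begin
  Result := InternalGetRec;
end;

// Result is used again after the call. Here the temporary record matters:
// if InternalGetRec raised an exception, Result would still hold 'Fallback'.
function GetRecNotLast: TMyRec;
begin
  Result.Value := 'Fallback';
  try
    Result := InternalGetRec;
  except
    // Result is unchanged at this point if the call failed
  end;
  Result.Value := Result.Value + '!';
end;

begin
  Writeln(GetRecLast.Value);
  Writeln(GetRecNotLast.Value);
end.
```

Only the first function matches the "last statement" condition. The second one relies on the guarantee that the temporary record provides.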
With the release of IDE Fix Pack 6.1, the Compiler Speed Pack not only makes the compiler compile faster but it can now also change the generated code, something that IDE Fix Pack has never done before. For this, new command line compiler options are introduced. They all start with “-x” (eXtension) followed by the Compiler Speed Pack option (-x-ff -x-fdi -x-fvs -x-fpr) and if you want to use them from the command line compiler you need to use fastdcc32/fastdcc64.
You can specify these options in the “Project/Options…” dialog under “Delphi Compiler/Compiling/Additional options to pass to the compiler”. You may need to rebuild the project to see an effect as only then the compiler will generate new code.
Windows 10 Creators Update 1703 caused issues with all Delphi programs, libraries and packages because it changed how Windows loads imported DLLs in such a way that it causes performance issues and can crash the debugger. Delphi 10.2 Tokyo Update 2 fixed this by not producing multiple dll import sections for one DLL anymore. IDE Fix Pack 6.1 implements that “feature” for all previous Delphi versions (2009-10.1 Berlin) and extends it to not only eliminate duplicate dll imports but also duplicate delay dll imports.
This patch changes the generated binary, the Win32 and Win64 compiler outputs, and it can be disabled by using the new “-x-fdi-” option.
The next patch that changes the Win32 code generator is the “fast floating point” option that C++Builder users may know from the “bcc32.exe -ff” option. It removes all “fwait” instructions that the compiler usually emits after floating point operations. Removing “fwait” may cause FPU exceptions to be thrown at the wrong source code line.
This option is disabled by default and can be enabled by specifying the new option “-x-ff”.
When calling virtual methods through an interface the Win32 Delphi compiler has to route that call through a helper function that translates the interface reference into an object reference and calls the virtual method. For this helper the compiler uses the XCHG instruction that has an implicit CPU LOCK.
The new “-x-fvs” / “-x-fvs=1” option replaces the XCHG instruction with alternative code, and if the called virtual function doesn’t use the ECX register for the 3rd parameter, it generates a direct jump into the virtual method.
The “-x-fvs=2” option replaces the XCHG and uses ECX if available, but if ECX is not available it keeps the CPU’s “return stack cache” valid by replacing the RET with a JMP instruction. For this it uses stack memory below ESP.
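As an illustration of the kind of call that goes through this helper, here is a minimal sketch with two virtual methods that are called through an interface. ICounter and TCounter are made-up names. The comments about ECX assume the default register calling convention, where Self and the first two parameters are passed in EAX, EDX and ECX. They are my reading of the description above, not something I verified in the generated code.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only so the sketch also compiles with Free Pascal
program InterfaceStubDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical interface for the illustration
  ICounter = interface
    function Next: Integer;
    procedure Add(A, B: Integer);
  end;

  TCounter = class(TInterfacedObject, ICounter)
  private
    FValue: Integer;
  public
    // Only Self is passed, so ECX should be free for the stub to use.
    function Next: Integer; virtual;
    // Self, A and B occupy EAX, EDX and ECX, so ECX is not available here.
    procedure Add(A, B: Integer); virtual;
  end;

function TCounter.Next: Integer;
begin
  Inc(FValue);
  Result := FValue;
end;

procedure TCounter.Add(A, B: Integer);
begin
  Inc(FValue, A + B);
end;

var
  Counter: ICounter;
begin
  Counter := TCounter.Create;
  Counter.Add(1, 2);     // virtual method called through the interface
  Writeln(Counter.Next); // same, but with ECX unused by the parameters
end.
```

Calls on a TCounter object reference don't need the interface-to-object translation, so they are not affected by the option.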
For some functions that meet special conditions the Win32 compiler emits stack frame code that fills the stack with zeros to clear variables with managed types. If there are too many of those the compiler uses a loop and the XCHG instruction to restore the ECX register that is used as the loop counter.
The option “-x-fpr” replaces the XCHG with alternative code.
- Added: Option -x-ff to enable “fast floating point” (like Borland C++’s -ff command line option)
- Added: Option -x-fvs and -x-fvs=n to enable fast interface virtual stub (n=1: replace XCHG, n=2: keep the CPU’s return stack buffer in order)
- Added: Option -x-fpr to remove XCHG from the function prolog code.
- Added: DLL import table section folding and duplicate name/ordinal elimination, also for delay dll imports
- Changed: Split “Compiler64.X86” patch into multiple smaller patches and removed the “Compiler64.X86” patch name
- Changed: EditorFocusFix now skips the SetActiveWindow call if the mainform (undocked) is not the active window
The new IDE Fix Pack version 6.0 is available. It supports Delphi 10.2 RTM and 10.2 Update 1. And after over a year of being in BETA testing without any bug reports, I also included all the Win64 compiler performance optimizations. Thus the jump to version 6.0 can finally be done as they make the Win64 compiler up to 50% faster.
- Added: Win64 compile speed optimizations
- Added: Delphi 10.2 Update 1 support
- Added: Editor Block Completion UTF8 fix (Delphi 2009 only)
Debugger Callstack Resolver is a Delphi IDE plugin that I wrote in 2011 to make the IDE’s CPU View more readable. It colors different instructions, resolves absolute and memory address jump and call targets and shows their function name if available. It also uses the *.jdbg files to show more information (the dcc32 compiler’s jdbg files are for the debug build of the compiler and haven’t matched the deployed version since jdbg files exist).
This plugin supports Delphi 2009 – 10.1 Berlin.
Due to many requests, I took the time to update my tools (IDE Fix Pack, DDevExtensions, DFMCheck) for the newest Delphi version. Delphi 10.2 RTM will be the last version that all my tools support. I won’t be able to recompile them for any 10.2 Update because my update subscription has expired and I didn’t (intentionally) renew it. Coincidentally I didn’t get a renewal message this time either.
This doesn’t mean that I won’t release newer versions of my tools, but I can’t support newer Delphi versions or 10.2 updates if the tools need to be recompiled. When I find some new possible optimizations or fixes for annoying bugs and time to work on them, Delphi 2009-10.2 RTM users will be able to use them.
- Fixed: Disable DynArraySetLength patch if 10.1 Berlin Update 2 is detected.
- Fixed: “clang template debug symbol bloat” disabled for 10 Seattle and newer.
- Added: IDE minimize doesn’t shrink main window to width and height zero.
- Added: RAD Studio 10.2 support
There is also a new IDE Fix Pack 6.0 Beta 4 that contains all the above and the experimental 64 bit compiler performance optimizations.
This RAD Studio IDE plugin enables the F12 debug hotkey for the Win32 debugger for Windows Vista, 7, 8, 8.1 and Windows 10.
If you don’t know what this plugin is for, then you don’t need it.
Eugene Kotlyarov found a problem with Castalia’s clipboard history feature combined with a TRichEdit in a debugged application. If you copy text from the RichEdit to the clipboard the first time, the IDE and the debugged application stop responding and only start reacting again after a timeout of about 30 seconds. After that, copy to clipboard works until you restart the debugged application. Win32 and Win64 have the same problem.
The debugged application invokes WM_COPY by Ctrl+C/Ctrl+Insert/Context-menu “Copy”.
WM_COPY sets the RichEdit as the clipboard renderer without putting the actual data into the clipboard.
Castalia’s TClipboardHistoryForm receives the WM_CLIPBOARDUPDATE message in the main thread and calls GetClipboardData (via Clipboard.AsText).
The debugged application’s RichEdit loads a DLL to provide the actual clipboard data for the GetClipboardData call.
The debugger receives a LOAD_DLL_DEBUG_EVENT for the DLL.
The debugger posts a message to the debugger window in the main thread and goes to sleep with WaitForSingleObject to wait for the debugger window to process the DLL load event.
Now everything waits. GetClipboardData is blocked because the clipboard owner, the debugged application, is trapped in a debugger event. The debugger is blocked because the debugger window doesn’t get the posted message because the blocked GetClipboardData prevents the main thread from processing messages.
Fortunately a Microsoft developer knew about this possible deadlock and put a timeout into GetClipboardData.
A solution that I’ve put into the IDE Fix Pack development version (not available) moves the WM_CLIPBOARDUPDATE message handling into a thread window, so that the call to GetClipboardData doesn’t block the main thread and the debugger can process the DLL load event.