Castalia’s Clipboard history + TRichEdit = IDE deadlock

By | August 15, 2016

Eugene Kotlyarov found a problem with Castalia’s clipboard history feature combined with a TRichEdit in a debugged application. The first time you copy text from the RichEdit to the clipboard, the IDE and the debugged application stop responding and only react again after a timeout of about 30 seconds. After that, copying to the clipboard works until you restart the debugged application. Win32 and Win64 have the same problem.
https://plus.google.com/+EugeneKotlyarovPlus/posts/R783AB3DfYL

What happens:

  • The debugged application invokes WM_COPY by Ctrl+C/Ctrl+Insert/Context-menu “Copy”.
  • WM_COPY sets the RichEdit as the clipboard renderer without putting the actual data into the clipboard.
  • Castalia’s TClipboardHistoryForm receives the WM_CLIPBOARDUPDATE message in the main thread and calls GetClipboardData (via Clipboard.AsText).
  • The debugged application’s RichEdit loads a DLL to provide the actual clipboard data for the GetClipboardData call.
  • The debugger receives a LOAD_DLL_DEBUG_EVENT for the DLL.
  • The debugger posts a message to the debugger window in the main thread and goes to sleep with WaitForSingleObject to wait for the debugger window to process the DLL load event.

Now everything waits. GetClipboardData is blocked because the clipboard owner, the debugged application, is trapped in a debugger event. The debugger is blocked because its window never receives the posted message: the blocked GetClipboardData call keeps the main thread from processing messages.

Fortunately a Microsoft developer knew about this possible deadlock and put a timeout into GetClipboardData.

A solution that I’ve put into the IDE Fix Pack development version (not publicly available) moves the WM_CLIPBOARDUPDATE message handling into a window owned by a separate thread, so that the call to GetClipboardData doesn’t block the main thread and the debugger can process the DLL load event.

What’s wrong with virtual methods called through an interface

By | May 31, 2016

Calling a virtual method through an interface has always been a lot slower than calling a static method through an interface. But why is that? Sure, the virtual method call costs some time, but comparing it with the difference between a normal static and a normal virtual method call shows that the timings diverge far more than that difference explains.

i7-4790 3.6GHz
10,000,000 calls to empty method

                        Instance call   Interface call
Static method           12 ms           17 ms
Virtual method          17 ms           164 ms
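
A measurement like the one in the table could be set up roughly as follows. This is only a sketch: the names IBench, TBench and RunBench are made up for the example, the interface carries both a static and a virtual method so that all four combinations can be timed in one program, and the absolute numbers will depend on the CPU, the compiler version and the compiler options.

```pascal
program IntfCallBench;

{$APPTYPE CONSOLE}

uses
  System.Diagnostics;

type
  // Hypothetical test types: one static and one virtual method
  // behind the same interface.
  IBench = interface
    procedure StaticTest(A, B: Integer);
    procedure VirtualTest(A, B: Integer);
  end;

  TBench = class(TInterfacedObject, IBench)
  public
    procedure StaticTest(A, B: Integer);
    procedure VirtualTest(A, B: Integer); virtual;
  end;

procedure TBench.StaticTest(A, B: Integer);
begin
  // intentionally empty, only the call overhead is measured
end;

procedure TBench.VirtualTest(A, B: Integer);
begin
  // intentionally empty, only the call overhead is measured
end;

procedure RunBench;
const
  CallCount = 10000000;
var
  Obj: TBench;
  Intf: IBench;
  Watch: TStopwatch;
  I: Integer;
begin
  Obj := TBench.Create;
  Intf := Obj; // the interface reference keeps the object alive

  Watch := TStopwatch.StartNew;
  for I := 1 to CallCount do
    Obj.StaticTest(I, I);
  Writeln('Instance call,  static method:  ', Watch.ElapsedMilliseconds, ' ms');

  Watch := TStopwatch.StartNew;
  for I := 1 to CallCount do
    Obj.VirtualTest(I, I);
  Writeln('Instance call,  virtual method: ', Watch.ElapsedMilliseconds, ' ms');

  Watch := TStopwatch.StartNew;
  for I := 1 to CallCount do
    Intf.StaticTest(I, I);
  Writeln('Interface call, static method:  ', Watch.ElapsedMilliseconds, ' ms');

  Watch := TStopwatch.StartNew;
  for I := 1 to CallCount do
    Intf.VirtualTest(I, I);
  Writeln('Interface call, virtual method: ', Watch.ElapsedMilliseconds, ' ms');
end;

begin
  RunBench;
end.
```

Only the last loop goes through the compiler-generated helper for a virtual method, so that is the line to watch when comparing compilers or patched helpers.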

Let’s assume we have this declaration:

type
  IMyInterface = interface
    procedure Test(A, B: Integer);
  end;
  TTest = class(TInterfacedObject, IMyInterface)
  public
    procedure Test(A, B: Integer); virtual;
  end;

The compiler will generate a helper function for the interface method “Test”. This helper converts the “MyIntf” interface reference in the call “MyIntf.Test()” to the object reference behind the interface and then jumps to the virtual method.

add eax,-$0C   // convert the interface reference to the object reference
push eax       // save the object reference on the stack
mov eax,[eax]  // access the VMT
mov eax,[eax]  // get the “Test” VMT method address
xchg [esp],eax // swap the object ref on the stack with the method address
ret            // do the jump to the method address

This is very slow, as you can see in the table above. If you know the “XCHG mem,reg” instruction, then you also know that it has an implicit LOCK prefix that slows down the method call a lot. But why is it using the XCHG instruction in the first place? Well, we are in the middle of a method call. All the parameters are already loaded into EAX, EDX and ECX, so we can’t use those registers to do the swap. The only way is to use the stack as a temporary variable, and XCHG seemed to be the choice of the compiler engineer at the time interfaces were introduced to Delphi.

Let’s change that code to not use XCHG.

add eax,-$0C       // convert the interface reference to the object reference
push eax           // reserve space for the method address used by RET
push eax           // save the object reference on the stack
mov eax,[eax]      // access the VMT
mov eax,[eax]      // get the “Test” VMT method address
mov [esp+04],eax   // write the method address to the reserved space
pop eax            // restore the object reference
ret                // do the jump to the method address

i7-4790 3.6GHz
10,000,000 calls to empty method

                        Instance call   Interface call
Static method           12 ms           17 ms
Virtual method          17 ms           99 ms
Virtual method (XCHG)                   164 ms

This is a lot faster, but still slow compared to the “Instance call”. The helper has a lot of memory accesses, but they shouldn’t slow it down that much, especially not in a tight loop where everything comes from the CPU’s cache.

So where does the code spend the time? Well, modern CPUs (after P1) have a feature called “return stack buffer”. The CPU puts the return address on the “return stack buffer” for every CALL instruction so it can predict where the RET instruction will jump to. This requires that every CALL is matched by a RET. But wait, the helper uses a RET for an indirect jump. We have the CALL from the interface method call, the RET in the helper and the RET in the actual method. That doesn’t match up. In other words, this helper renders the “return stack buffer” invalid, which comes with a performance hit because the CPU can’t predict where to jump.
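
To make the mismatch concrete, here is a deliberately simplified toy model of such a buffer. It is not how any particular CPU implements its predictor; the class TRsbModel, the function SimulateHelper and the three addresses are invented for the illustration. Every CALL pushes its return address, every RET pops one entry and counts a misprediction if the popped entry is not where the RET actually goes.

```pascal
program RsbToyModel;

{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

const
  // Hypothetical addresses, chosen only to be distinguishable.
  OuterReturn = $1000; // where the routine containing MyIntf.Test() returns to
  CallReturn  = $2000; // instruction after the "call" of MyIntf.Test()
  MethodAddr  = $3000; // entry point of TTest.Test

type
  // Toy model: a stack of predicted return addresses plus a miss counter.
  TRsbModel = class
  private
    FStack: TStack<Integer>;
    FMispredicts: Integer;
  public
    constructor Create;
    destructor Destroy; override;
    procedure DoCall(ReturnAddr: Integer);
    procedure DoRet(ActualTarget: Integer);
    property Mispredicts: Integer read FMispredicts;
  end;

constructor TRsbModel.Create;
begin
  inherited Create;
  FStack := TStack<Integer>.Create;
end;

destructor TRsbModel.Destroy;
begin
  FStack.Free;
  inherited;
end;

procedure TRsbModel.DoCall(ReturnAddr: Integer);
begin
  FStack.Push(ReturnAddr); // CALL records where the matching RET should go
end;

procedure TRsbModel.DoRet(ActualTarget: Integer);
begin
  // RET consumes one prediction; an empty buffer or a wrong entry is a miss
  if (FStack.Count = 0) or (FStack.Pop <> ActualTarget) then
    Inc(FMispredicts);
end;

function SimulateHelper(HelperUsesRet: Boolean): Integer;
var
  Rsb: TRsbModel;
begin
  Rsb := TRsbModel.Create;
  try
    Rsb.DoCall(OuterReturn);   // somebody called the routine we are in
    Rsb.DoCall(CallReturn);    // CALL into the interface helper
    if HelperUsesRet then
      Rsb.DoRet(MethodAddr);   // helper "returns" into TTest.Test
    // a helper ending in JMP does not touch the buffer at all
    Rsb.DoRet(CallReturn);     // RET at the end of TTest.Test
    Rsb.DoRet(OuterReturn);    // RET of the surrounding routine
    Result := Rsb.Mispredicts;
  finally
    Rsb.Free;
  end;
end;

begin
  Writeln('Helper ends with RET: ', SimulateHelper(True), ' mispredicted returns');  // 3
  Writeln('Helper ends with JMP: ', SimulateHelper(False), ' mispredicted returns'); // 0
end.
```

In this model the extra RET does not just miss once. It shifts every following prediction by one entry, so the RET of the method and the RET of the surrounding routine miss as well.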

Let’s see what happens if we replace the RET with a JMP.

add eax,-$0C         // convert the interface reference to the object reference
push eax             // save the object reference on the stack
mov eax,[eax]        // access the VMT
push DWORD PTR [eax] // save the “Test” VMT entry method address on the stack
add esp,$04          // skip the method address stack entry
pop eax              // restore the object reference
jmp [esp-$08]        // jump to the method address

i7-4790 3.6GHz
10,000,000 calls to empty method

                        Instance call   Interface call
Static method           12 ms           17 ms
Virtual method          17 ms           24 ms
Virtual method (RET)                    99 ms
Virtual method (XCHG)                   164 ms

UPDATE: As fast as this implementation may be, it has a problem. As Allen and Mark pointed out, it accesses memory on the stack that the system treats as free memory. So if a hardware interrupt is triggered between the “add esp,$04” and the “jmp [esp-$08]”, the data on the stack is overwritten and the jump will end up somewhere other than the intended method.

UPDATE 2: Thorsten Engler sent me an e-mail that invalidates the “hardware interrupt problem”. All interrupts are handled in kernel mode and kernel mode code doesn’t touch the user stack. The CPU itself switches the SS:ESP before invoking the interrupt handler.

Based on AMD64 Architecture Programmer’s Manual Volume 2 – System Programming Rev.3.22 Section 8.7.3 Interrupt To Higher Privilege:

When a control transfer to an exception or interrupt handler running at a higher privilege occurs (numerically lower CPL value), the processor performs a stack switch using the following steps:

  1. The target CPL is read by the processor from the target code-segment DPL and used as an index into the TSS for selecting the new stack pointer (SS:ESP). For example, if the target CPL is 1, the processor selects the SS:ESP for privilege-level 1 from the TSS.
  2. Pushes the return stack pointer (old SS:ESP) onto the new stack. The SS value is padded with two bytes to form a doubleword.

System.ByteStrings for 10.1 Berlin

By | May 31, 2016

Delphi 10.1 Berlin reintroduces UTF8String and RawByteString for the NextGen compilers (Android, iOS). But ShortString and AnsiString are still missing. The compiler has full support for them, but you can’t use them because they are declared with a leading underscore in the System.pas unit. That makes them inaccessible because “_” is compiled to “@”, which you can’t use in an identifier.

By patching DCU files it is possible to make those hidden types accessible.

The unit System.ByteStrings for 10.1 Berlin reintroduces

  • ShortString
  • AnsiString
  • AnsiChar
  • PAnsiChar
  • PPAnsiChar
  • UTF8String (XE5-10 Seattle)
  • PUTF8String (XE5-10 Seattle)
  • RawByteString (XE5-10 Seattle)
  • PRawByteString (XE5-10 Seattle)

Usage:
Add the System.ByteStrings.dcu’s path to the compiler’s search path and add the unit to your uses clauses.

There is no *.PAS file because the DCU is patched with a hex editor to get access to the hidden types.

Name                IDE Version       File                  Size     Downloads   Added
System.ByteStrings  XE5 RTM/UP1 only  XE5ByteStrings.7z     2.45 KB  1572 times  2013-10-23
System.ByteStrings  XE5 UP2 only      XE5Up2ByteStrings.7z  2.85 KB  1498 times  2013-12-20
System.ByteStrings  XE6               XE6ByteStrings.7z     2.89 KB  1329 times  2014-04-16
System.ByteStrings  XE7               XE7ByteStrings.7z     2.89 KB  1661 times  2015-01-20
System.ByteStrings  XE8               XE8ByteStrings.7z     3.69 KB  1718 times  2015-04-16
System.ByteStrings  10 Seattle        D10ByteStrings.7z     3.67 KB  1900 times  2015-09-01
System.ByteStrings  10.1 Berlin       D101ByteStrings.7z    3.72 KB  2369 times  2016-05-31

IDE Fix Pack 5.95 for Delphi 10.1 Berlin

By | May 30, 2016

IDE Fix Pack 5.95 supports RAD Studio 10.1 Berlin.

When Windows Defender (or any other anti-virus tool) sees the compiler creating a DCU file and the compiler calls CloseHandle on the file handle, the virus/malware scanner blocks the thread and takes its time to have a look at the file. This causes CloseHandle to take more than 2 milliseconds per file on my system. If you have 2500 units this adds up to 5 seconds. With Windows Defender disabled those 5 seconds drop to under 100 milliseconds.
Because the compiler can’t do any work during those 5 seconds, the IDE Fix Pack now delegates the CloseHandle calls to a background thread. This means Windows Defender can scan the file while the compiler works on the next units without being blocked by the scan.
On my Win32 test project this parallel execution made the rebuild 5 seconds faster. IDE Fix Pack guarantees that all written DCU files are closed before the binary executable is created. So if you have only a few units you won’t see much of a speed improvement, because the main thread may have to wait for the background thread to close all remaining files.
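
The idea of handing CloseHandle to the background and waiting before the final link step can be sketched in plain Delphi code. This is not the IDE Fix Pack implementation, which patches the compiler itself. The unit name DeferredClose and both procedures are invented for the illustration, the sketch uses the RTL thread pool (System.Threading, available from XE7 on) instead of a dedicated thread, and it assumes that both procedures are only called from one thread.

```pascal
unit DeferredClose;

interface

uses
  Winapi.Windows;

// Hypothetical helper: hands the handle to a background task and returns at once.
procedure CloseHandleLater(H: THandle);
// Hypothetical helper: blocks until every handle passed so far has been closed.
procedure WaitForDeferredCloses;

implementation

uses
  System.SysUtils, System.Threading, System.Generics.Collections;

var
  // Tasks that are still closing (or have closed) a handle.
  // Assumption: only touched by the one thread that calls the two procedures.
  Pending: TList<ITask>;

procedure CloseHandleLater(H: THandle);
begin
  Pending.Add(TTask.Run(
    procedure
    begin
      // A virus scanner may block here, but it no longer blocks the caller.
      CloseHandle(H);
    end));
end;

procedure WaitForDeferredCloses;
begin
  if Pending.Count > 0 then
    TTask.WaitForAll(Pending.ToArray);
  Pending.Clear;
end;

initialization
  Pending := TList<ITask>.Create;

finalization
  WaitForDeferredCloses;
  Pending.Free;

end.
```

A caller would use CloseHandleLater(FileHandle) in place of CloseHandle after writing each file, and call WaitForDeferredCloses once before the step that needs all files to be closed. With only a handful of files that final wait eats most of the gain, which matches the remark about small projects above.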

Changelog:

  • Added: RAD Studio 10.1 Berlin support
  • Added: CloseHandle for created DCU files is delegated to a background thread. Windows Defender “workaround”
  • Added: Fix for RSP-14557: DynArraySetLength – resizing an array of managed type is causing entire copy instead of realloc (D10.1, only the IDE)
  • Added: Fix for RSP-13116: TCustomImageList.BeginUpdate/EndUpdate (D10.0)

Download:

Name                IDE Version           File                      Size       Downloads    Added
IDE Fix Pack 6.4.2  2009 (UP4)            IDEFixPack2009Reg64.2.7z  242.75 KB  5652 times   2019-03-23
IDE Fix Pack 6.4.2  2010 (UP5)            IDEFixPack2010Reg64.2.7z  237.09 KB  6464 times   2019-03-23
IDE Fix Pack 6.4.2  XE (UP1)              IDEFixPackXEReg64.2.7z    221.38 KB  3996 times   2019-03-23
IDE Fix Pack 6.4.2  XE2 (UP4+HF1)         IDEFixPackXE2Reg64.2.7z   316.78 KB  4409 times   2019-03-23
IDE Fix Pack 6.4.2  XE3 (UP2)             IDEFixPackXE3Reg64.2.7z   257.4 KB   3547 times   2019-03-23
IDE Fix Pack 6.4.2  XE4 (UP1)             IDEFixPackXE4Reg64.2.7z   260.1 KB   3162 times   2019-03-23
IDE Fix Pack 6.4.2  XE5 (UP2)             IDEFixPackXE5Reg64.2.7z   257.7 KB   3643 times   2019-03-23
IDE Fix Pack 6.4.2  XE6 (UP1)             IDEFixPackXE6Reg64.2.7z   423 KB     3339 times   2019-03-23
IDE Fix Pack 6.4.2  XE7 (UP1)             IDEFixPackXE7Reg64.2.7z   429.48 KB  4503 times   2019-03-23
IDE Fix Pack 6.4.2  XE8 (UP1)             IDEFixPackXE8Reg64.2.7z   431.7 KB   3794 times   2019-03-23
IDE Fix Pack 6.4.2  10 Seattle (RTM/UP1)  IDEFixPackD10Reg64.2.7z   428.33 KB  5274 times   2019-03-23
IDE Fix Pack 6.4.2  10.1 Berlin           IDEFixPackD101Reg64.2.7z  430.65 KB  5862 times   2019-03-23
IDE Fix Pack 6.4.2  10.2 (RTM/UP1/2/3)    IDEFixPackD102Reg64.2.7z  426.27 KB  9285 times   2019-03-23
IDE Fix Pack 6.4.4  10.3 (RTM/UP1/2/3)    IDEFixPackD103Reg64.4.7z  444.98 KB  17680 times  2019-08-01

Download (fastdcc):

Name           IDE Version           File                 Size       Downloads   Added
fastdcc 6.4.2  2009 (UP4)            fastdcc2009v64.2.7z  112.87 KB  3035 times  2019-03-23
fastdcc 6.4.2  2010 (UP5)            fastdcc2010v64.2.7z  120.38 KB  3157 times  2019-03-23
fastdcc 6.4.2  XE (UP1)              fastdccXEv64.2.7z    121.36 KB  2903 times  2019-03-23
fastdcc 6.4.2  XE2 (UP4+HF1)         fastdccXE2v64.2.7z   166.48 KB  2942 times  2019-03-23
fastdcc 6.4.2  XE3 (UP2)             fastdccXE3v64.2.7z   150.88 KB  2777 times  2019-03-23
fastdcc 6.4.2  XE4 (UP1)             fastdccXE4v64.2.7z   153.55 KB  2738 times  2019-03-23
fastdcc 6.4.2  XE5 (UP2)             fastdccXE5v64.2.7z   151.87 KB  2842 times  2019-03-23
fastdcc 6.4.2  XE6 (UP1)             fastdccXE6v64.2.7z   198.67 KB  2837 times  2019-03-23
fastdcc 6.4.2  XE7 (UP1)             fastdccXE7v64.2.7z   219.84 KB  3023 times  2019-03-23
fastdcc 6.4.2  XE8 (UP1)             fastdccXE8v64.2.7z   224.67 KB  2869 times  2019-03-23
fastdcc 6.4.2  10 Seattle (RTM/UP1)  fastdccD10v64.2.7z   219.65 KB  3193 times  2019-03-23
fastdcc 6.4.2  10.1 Berlin           fastdccD101v64.2.7z  223.52 KB  3267 times  2019-03-23
fastdcc 6.4.2  10.2 (RTM/UP1/2/3)    fastdccD102v64.2.7z  219.06 KB  4088 times  2019-03-23
fastdcc 6.4.4  10.3 (RTM/UP1/2/3)    fastdccD103v64.4.7z  228.61 KB  5552 times  2019-07-31

There is also a new IDE Fix Pack 6.0 Beta 3 that contains all the above and the experimental 64 bit compiler performance optimizations.

DDevExtensions and DFMCheck for 10.1 Berlin

By | May 29, 2016

The DDevExtensions and the DFMCheck IDE plugins are now available for 10.1 Berlin.

Download:

Name                 IDE Version   File                             Size       Downloads    Added
DDevExtensions 1.61  5-2007        DDevExtensions161Setup.zip       734.07 KB  20277 times  2009-01-10
DDevExtensions 2.8   Features PDF  DDevExtensionsFeatures.pdf       602.92 KB  18099 times  2014-12-27
DDevExtensions 2.4   7, 2007       DDevExtensions24Setup7_2007.zip  535.41 KB  13222 times  2011-07-25
DDevExtensions 2.86  2009-10.3     DDevExtensions286.7z             1.24 MB    6096 times   2020-05-30
DDevExtensions 2.88  2009-10.4.2   DDevExtensions288.7z             1.3 MB     5426 times   2021-07-20

Name                 IDE Version   File                             Size       Downloads    Added
DFMCheck 1.6         5-10.3        DfmCheckSetup16.7z               717.43 KB  3064 times   2018-12-08

DDevExtensions Changelog:

  • Version 2.84 (2016-05-28)
    • Added: TAB key works like ENTER in the CodeInsight window.
    • Added: 10.1 Berlin support

DFMCheck 1.6 released

By | January 25, 2016

I made a small update to the DFMCheck RAD Studio IDE plugin. The new version 1.6 adds a “Yes to all” button to the “Open/Close all forms” confirmation dialog for modified DFM files, and it uses the Vista+ taskbar progress feature.

More information about DFMCheck:
https://www.idefixpack.de/blog/ide-tools/dfmcheck/

IDE Fix Pack 6.0 BETA

By | December 19, 2015

The IDE Fix Pack 6.0 BETA for XE2-10 Seattle focuses on the Win64 compiler’s compile speed. It not only uses SSE2-SSE4.1 instructions to increase the compiler’s performance but it also optimizes for the most common cases. Micro optimizations are used for tight loops and functions that are called a million times.

Even with this patch the Win64 compiler isn’t on par with the Win32 compiler’s speed, but the difference isn’t that huge any more and there are still some possible optimizations. But it takes a lot of time to analyze those and write the patches. And if I held back version 6.0 until all of those are completed, you would never see a version 6.0.

Because those changes to the compiler may introduce bugs, I’m not releasing a gold version but a BETA version. That means that this isn’t meant for production use.

Don’t forget: I’m not Embarcadero and I’m not fixing somebody’s pet bug. This BETA is all about the Win64 compiler speed optimizations. So please don’t report bugs (in the comments, at Google+, by email) that aren’t related to the Win64 compiler speed optimizations. If the compiler outputs defective code with and without my patches, I won’t (and can’t) help you.

IDE Fix Pack 5.94 released – RAD Studio 10 Seattle support

By | November 1, 2015

IDE Fix Pack 5.94 supports RAD Studio 10 Seattle and adds some additional patches.

Changelog:

  • Added: RAD Studio 10 Seattle support
  • Added: Patch for Clipboard History exception from 10 Seattle Castalia integration (10Seattle)
  • Added: timeBeginPeriod/timeEndPeriod calls from IDEVirtualTrees disabled (battery drain)
  • Added: Removed unnecessary memory reallocations for 64bit and AARM compiler (XE4+)
  • Added: CodeInsight popup window border fix for Windows 10.

IDE Fix Pack 5.93 re-release for Summer 2015 XE7 Hotfix

By | October 7, 2015

IDE Fix Pack for XE7 fails to load if the Summer 2015 XE7 Hotfix is installed. To get around this incompatibility I re-released the IDE Fix Pack 5.93 for XE7. This re-release requires Update 1 and can run with and without the XE7 Summer 2015 Hotfix. fastdcc for XE7 isn’t affected by this.
