<h1><a href="http://klickverbot.at/blog/2018/08/buildbot-conda-systemd-units">Systemd units for Buildbot in Conda</a> (2018-08-25)</h1>
<p>Buildbot is a Python framework for continuous integration systems. In <a href="https://www2.physics.ox.ac.uk/research/ion-trap-quantum-computing-group">my research group</a> we are deploying it in a <a href="https://conda.io/">Conda</a> environment, which we also use to manage all the different moving parts of our Python-centric control infrastructure on both Windows and Linux. To start up the master and worker services, the corresponding Conda environment needs to be activated first. This is easiest to achieve using simple wrapper scripts.</p>
<p>For this, let’s assume we’ve created a <code>bb</code> user, with the master and worker configurations in its home directory (<code>~/master</code> and <code>~/worker</code>). First, create a wrapper script to start up the master process:</p>
<figure class="code"> <div class="highlight"><pre><span class="c">#!/bin/bash</span>
<span class="nb">set</span> -eo pipefail
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span>~/anaconda3/bin:<span class="nv">$PATH</span>
<span class="nb">source </span>activate buildbot
buildbot start --nodaemon master
</pre></div><figcaption><span>~bb/start-master.sh </span></figcaption>
</figure>
<p>This assumes Conda has been installed into <code>~bb/anaconda3</code>, and the environment with the Buildbot installation is called <code>buildbot</code>. <code>master</code> is the name of the configuration directory, and <code>--nodaemon</code> prevents daemonisation (i.e. keeps the process running in the foreground).</p>
<p>Make the script executable, and create a Systemd unit file that invokes it:</p>
<figure class="code"> <div class="highlight"><pre><span class="k">[Unit]</span>
<span class="na">Description</span><span class="o">=</span><span class="s">Buildbot master service</span>
<span class="na">After</span><span class="o">=</span><span class="s">network.target</span>
<span class="k">[Service]</span>
<span class="na">User</span><span class="o">=</span><span class="s">bb</span>
<span class="na">Group</span><span class="o">=</span><span class="s">bb</span>
<span class="na">WorkingDirectory</span><span class="o">=</span><span class="s">/home/bb</span>
<span class="na">ExecStart</span><span class="o">=</span><span class="s">/home/bb/start-master.sh</span>
<span class="na">ExecReload</span><span class="o">=</span><span class="s">/bin/kill -HUP $MAINPID</span>
<span class="k">[Install]</span>
<span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span>
</pre></div><figcaption><span>/etc/systemd/system/buildbot-master.service </span></figcaption>
</figure>
<p>To start the master process, run</p>
<figure class="code"> <div class="highlight"><pre>systemctl start buildbot-master
</pre></div></figure>
<p>and to do so every time the system boots:</p>
<figure class="code"> <div class="highlight"><pre>systemctl <span class="nb">enable </span>buildbot-master
</pre></div></figure>
<hr />
<p>The analogous configuration for the worker process is</p>
<figure class="code"> <div class="highlight"><pre><span class="c">#!/bin/bash</span>
<span class="nb">set</span> -eo pipefail
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span>~/anaconda3/bin:<span class="nv">$PATH</span>
<span class="nb">source </span>activate buildbot
buildbot-worker start --nodaemon worker
</pre></div><figcaption><span>~bb/start-worker.sh </span></figcaption>
</figure>
<p>and</p>
<figure class="code"> <div class="highlight"><pre><span class="k">[Unit]</span>
<span class="na">Description</span><span class="o">=</span><span class="s">Buildbot worker service</span>
<span class="na">After</span><span class="o">=</span><span class="s">network.target</span>
<span class="k">[Service]</span>
<span class="na">User</span><span class="o">=</span><span class="s">bb</span>
<span class="na">Group</span><span class="o">=</span><span class="s">bb</span>
<span class="na">WorkingDirectory</span><span class="o">=</span><span class="s">/home/bb</span>
<span class="na">ExecStart</span><span class="o">=</span><span class="s">/home/bb/start-worker.sh</span>
<span class="na">ExecReload</span><span class="o">=</span><span class="s">/bin/kill -HUP $MAINPID</span>
<span class="k">[Install]</span>
<span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span>
</pre></div><figcaption><span>/etc/systemd/system/buildbot-worker.service </span></figcaption>
</figure>
<p>Enable and start it using</p>
<figure class="code"> <div class="highlight"><pre>systemctl <span class="nb">enable </span>buildbot-worker
systemctl start buildbot-worker
</pre></div></figure>
<hr />
<p>That’s it; Buildbot now starts automatically when the system boots. To avoid starting the graphical user interface on a desktop Ubuntu install, run <code>systemctl set-default multi-user.target</code>.</p>
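<p>If the worker runs on the same host as the master, it can also be useful to order its startup after the master’s. A systemd drop-in along the following lines would do this (a sketch – the file name <code>order.conf</code> is arbitrary, and this is only helpful when master and worker share a machine):</p>

```ini
# /etc/systemd/system/buildbot-worker.service.d/order.conf
# Start the worker only after the (local) master unit has been started.
[Unit]
After=buildbot-master.service
Wants=buildbot-master.service
```

<p>Run <code>systemctl daemon-reload</code> afterwards so systemd picks up the drop-in.</p>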
<h1><a href="http://klickverbot.at/blog/2018/02/photographing-a-single-atom">Photographing a Single Atom</a> (2018-02-20)</h1>
<p class="lead">When I spent one Sunday night working on a photograph in our basement laboratory last August, I was admittedly quite pleased with the results. But I certainly didn’t expect the attention it recently received from news media around the globe after winning an award in a <a href="https://www.epsrc.ac.uk/newsevents/news/single-trapped-atom-captures-science-photography-competitions-top-prize/">photography competition</a>. In this post, I will try to provide some of the scientific background sorely missing from the original press release, and address a few commonly asked questions.</p>
<p>First of all, another look at the picture in question, taken on August 7, 2017, at 2:36 <span class="sc">AM</span> in the laboratories of the <a href="https://www2.physics.ox.ac.uk/research/ion-trap-quantum-computing-group">Ion Trap Quantum Computing group</a> (Prof. David Lucas and Prof. Andrew Steane) at the <a href="http://www2.physics.ox.ac.uk/">University of Oxford</a>. As part of my DPhil studies at <a href="https://www.balliol.ox.ac.uk">Balliol College</a>, I work on using ion traps for quantum computation, in particular towards <a href="http://nqit.ox.ac.uk/content/q2020-quantum-computer-demonstrator">distributing high-fidelity entanglement between several trap modules using optical links</a> (click for a high-quality version, 4.1 MiB):</p>
<figure><a href="/blog/2018/02/photographing-a-single-atom/nadlinger_single_atom_in_ion_trap_recrop_3200.jpg" target="_blank"><img alt="Photograph of a single Strontium atom in an ion trap." src="/blog/2018/02/photographing-a-single-atom/nadlinger_single_atom_in_ion_trap_recrop_1024.jpg" /></a><figcaption>An ion trap in an ultra-high vacuum vessel. In the centre of the picture, a small bright dot is visible – a single trapped <sup>88</sup>Sr<sup>+</sup> ion. (Overall 1<sup>st</sup> in the EPSRC 2018 Science Photography Competition; crop slightly changed here.)</figcaption></figure>
<p>Before getting into the details of the science behind all this, one particular misconception that has cropped up in the search for sensationalist headlines should be addressed:</p>
<hr />
<h3 id="is-this-an-advance-in-science-have-single-atoms-been-photographed-before">Is this an advance in science? Have single atoms been photographed before?</h3>
<p>In short: Not in the least; and yes, probably even before I was born.</p>
<p>First, the techniques that made this picture possible, ion traps and laser cooling, are part of the standard toolbox in modern physics experiments. The photo could have been taken in dozens—if not hundreds—of laboratories around the world, with any one of more than ten different species of atoms. To be very clear, the picture won a photography competition, not a science prize.</p>
<p>Nevertheless, the picture still showcases a lot of cool innovations in physics and engineering from the second half of the 20th century. To name just a few highlights—telling a scientific story in Nobel prizes necessarily paints a very incomplete picture: Both the aforementioned experimental techniques were recognized with the prestigious prize, ion traps in 1989, and the application of laser cooling to neutral atoms in 1997. In 2012, D. Wineland was awarded the Nobel prize for the development of methods to precisely manipulate the quantum state of trapped ions. This work forms the basis on which a number of research groups around the world, including ours, investigate trapped ions as building blocks for quantum information applications.</p>
<p>On the second point, this is nowhere near the first picture of a single atom, probably by almost forty years—I can’t help but think that someone working on the early ion trap experiments would have tried to take such a picture as well. Either way, taking pictures of single atoms has been a part of our experiments for a good ten years now, as a way of <a href="https://journals.aps.org/pra/abstract/10.1103/PhysRevA.81.040302">reading out the result of a quantum computation from our ion qubits</a> (<a href="https://arxiv.org/pdf/0906.3304.pdf">arXiv</a>)—the result in 0s and 1s is literally given as a pattern of spots that light up or stay dark. For another example, check out the <a href="https://www2.physics.ox.ac.uk/research/ion-trap-quantum-computing-group">second picture</a> on our group website, showing a string of <sup>43</sup>Ca<sup>+</sup> ions.</p>
<p>Neutral (uncharged) atoms can also be trapped using laser techniques. People have taken pictures of single atoms in this setting too, <a href="http://www.physics.otago.ac.nz/nx/mikkel/single-atom.html">such as this group in Otago, New Zealand</a>. Like the pictures used for ion qubit readout, these are typically taken with scientific cameras (lower noise, but usually monochrome) and through a microscope with narrow field of view. By using an ordinary camera, I was able to capture a picture in full colour, including more of the surrounding apparatus.</p>
<p>There is also this funny—possibly apocryphal—story from the early days of ion trapping, featuring a group around Hans Dehmelt (one of the above Nobel laureates) and a photo of a single Barium atom they took back in the very much analogue age of photography: Supposedly, their picture was mostly black, with only a few small bright spots from the ion itself and a bit of stray light hitting the trap electrodes. When they submitted the negatives for publication in some conference proceedings, the image editor promptly stamped out the atom, thinking it to be a speck of dust!</p>
<p>All in all, the scientific value of this image is virtually zero. Still, I hope it can convey some of the fascinating aspects of nature we get to explore in modern physics on a daily basis.</p>
<hr />
<p>The rest of this blog post is work in progress. For now, have a look at this 3D model of our trap, where the individual parts are easier to recognise than in the picture:</p>
<figure><img alt="3D model of the ion trap used in the picture, with different parts highlighted." src="/blog/2018/02/photographing-a-single-atom/blade_trap_render.jpg" /><figcaption>3D model of the trap from the photo. The <span class="sc">RF</span> and ground electrode pairs making the quadrupole potential are shown in yellow/lavender, the static "endcap" electrodes confining the ions along the trap axis in red. The purple wires are used to compensate for stray fields which would push the ion off the centre axis. On the top, the transparent cone illustrates the region our imaging system can collect photons from. (S. Woodrow, K. Thirumalai)</figcaption></figure>
<p>Colleagues of ours, the group around Rainer Blatt in Innsbruck, have <a href="https://quantumoptics.at/en/research/quantum-information.html">a few more pictures of similar traps on their website</a>.</p>
<p>A collection of some further links:</p>
<ul>
<li>
<p>Seeing a single atom in person is possible with a magnifying glass or a small microscope, as discussed in <a href="http://www.nytimes.com/1986/10/21/science/physicists-finally-get-to-see-quantum-jump-with-own-eyes.html">this New York Times article from 1986</a>. Apparently, this has been done even <a href="https://link.springer.com/chapter/10.1007/978-1-4612-4030-3_15">back in 1979</a>.</p>
</li>
<li>
<p>The radius of an atom is a bit tricky to define; according to one common definition (the space taken up in a molecular bond to another atom of the same species), the radius of the strontium atom is about 0.25 nm (a quarter of a billionth of a metre). When confined in a trap potential with frequency \(\omega\), its size is at least \(z_0 = \sqrt{\hbar / (2\ m_{Sr}\ \omega)}\) due to the Heisenberg uncertainty principle. For the trap parameters here, the radius is about 6.5 nm. This is far below the <a href="https://en.wikipedia.org/wiki/Diffraction-limited_system">diffraction limit</a> set by the light wavelength of 422 nm.</p>
</li>
<li>
<p>The temperature of the atom is approximately 0.5 mK, i.e. about 1/2000 of a degree above absolute zero (slightly above the <a href="https://en.wikipedia.org/wiki/Doppler_cooling#Minimum_temperature">Doppler limit</a>). Hence, the size due to “motion blur” would still be less than 300 nanometres.</p>
</li>
<li>
<p>The atom appears bigger due to imperfections in the lens and camera (<a href="https://en.wikipedia.org/wiki/Optical_aberration">optical aberrations</a>, plus the focus is slightly off). There seems to be a small amount of camera shake as well (the camera was mounted to the optical table in a somewhat precarious fashion using a cheap tripod head).</p>
</li>
<li>
<p>Since ions repel each other, one can take pretty pictures of interesting configurations of glowing atoms using a microscope. See for example <a href="https://www2.physics.ox.ac.uk/research/ion-trap-quantum-computing-group">our group website</a>, or the much fancier pictures by the groups at <a href="http://www.quantummetrology.de/quaccs/research/multi-ion-clocks/"><span class="sc">PTB</span> Braunschweig</a> and <a href="https://www.nist.gov/news-events/news/2016/06/nists-super-quantum-simulator-entangles-hundreds-ions"><span class="sc">NIST</span> Boulder</a>.</p>
</li>
<li>
<p>The photo was captured on August 7th, 2017, with a Canon EOS 5D Mark II and an <span class="sc">EF</span> 50 mm f/1.8 lens at 30s exposure time and f/4 (plus some extension tubes, and two flash units with colour gels).</p>
</li>
<li>
<p>The quantum efficiency and noise performance of modern digital camera sensors are surprisingly good. Roger N. Clark has collected a swath of useful information over at his website, see for example <a href="http://www.clarkvision.com/reviews/evaluation-canon-5dii/index.html">his page on the camera I used</a>, and his <a href="http://www.clarkvision.com/articles/digital.sensor.performance.summary/">overview of modern digital camera sensor performance</a>. The <a href="http://www.sensorgen.info/">Sensorgen.info</a> and <a href="http://www.photonstophotos.net/index.htm">Photons to Photos</a> websites also provide further information and data on sensor performance. In fact, it appears that if I had taken a closer look at the performance data before I took the picture, I could have optimised the settings a bit more to reduce the apparent noise.</p>
</li>
</ul>
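<p>The two numbers quoted in the list above – the uncertainty-limited wavepacket size and the Doppler-limit temperature – can be sketched with a quick back-of-the-envelope calculation. Note that the trap frequency and transition linewidth used here are assumed, typical values; the actual experimental parameters are not stated in this post:</p>

```python
import math

# Back-of-the-envelope numbers for a laser-cooled 88Sr+ ion.
hbar = 1.054571817e-34      # reduced Planck constant (J s)
k_B = 1.380649e-23          # Boltzmann constant (J/K)
m_sr = 88 * 1.66053907e-27  # approximate mass of 88Sr (kg)

# Ground-state wavepacket extent z0 = sqrt(hbar / (2 m omega)).
omega = 2 * math.pi * 1.0e6  # ASSUMED trap frequency (~1 MHz, typical for such traps)
z0 = math.sqrt(hbar / (2 * m_sr * omega))

# Doppler limit T_D = hbar * Gamma / (2 k_B) for the 422 nm cooling transition.
gamma = 2 * math.pi * 20e6   # ASSUMED natural linewidth (~20 MHz)
T_D = hbar * gamma / (2 * k_B)

print(f"z0  ~ {z0 * 1e9:.1f} nm")  # a few nm, the same scale as the ~6.5 nm quoted
print(f"T_D ~ {T_D * 1e3:.2f} mK") # ~0.5 mK, matching the temperature quoted above
```

<p>Both results land on the scales mentioned in the text, which is a useful sanity check on the formulas even without the exact experimental parameters.</p>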
<hr />
<p><em>[A long-form blog post is work in progress.]</em></p>
<h1><a href="http://klickverbot.at/blog/2016/01/testing-verilog-axi4-lite-peripherals">Testing Verilog AXI4-Lite Peripherals</a> (2016-01-30)</h1>
<p class="lead">Chips that combine one or more processor cores and FPGA fabric into one integrated system have become quite popular recently, the most well-known product being Xilinx’ ARM-based <a href="http://www.xilinx.com/products/silicon-devices/soc/zynq-7000.html"><em>Zynq</em></a> series. The standardized AXI buses connecting them make it trivial to bring custom IP cores into the processor address space. This post describes how to interface with such a core from a standalone Verilog test-bench.</p>
<p>The popularity these combined systems-on-a-chip have been enjoying lately in research labs is certainly in part due to the ease with which programmable logic can be connected to the CPU cores, compared to designing and implementing an interface between a discrete ARM processor and a stand-alone FPGA chip. This is because the Zynq chips feature several internal interconnects between the ARM cores and the programmable logic fabric (including access to the DDR system memory and cache coherency control). These buses follow ARM’s open AMBA AXI4 standard, which is available in several flavors: the base <em>AXI4</em> protocol, which defines a high-performance memory-mapped interface; <em>AXI4-Stream</em>, which realizes a unidirectional data flow with handshaking; and <em>AXI4-Lite</em>, which is similar to AXI4 but lacks advanced features like buffering, multiple widths and bursts. Any given device implementing one of these protocols acts as either a master or a slave.</p>
<p>Here, we will concern ourselves only with perhaps the simplest case, an AXI4-Lite slave. A typical example would be a low-bandwidth control channel from the ARM CPU to a custom IP core. Implementing such a device is quite easy, as the Xilinx development environment includes tooling to generate the code for interfacing with the AXI bus (although it seems that, compared to the average programmer, FPGA designers lack any sensibility for writing <em>pretty</em> or even just consistently formatted code). Of course, this leaves the question of how to verify that the IP core reacts correctly to these commands. As is usually the case in HDL design, you certainly don’t want to run the time-consuming synthesis process and re-flash the hardware on every iteration of the debugging process, only to then find yourself in an environment where it is hard to diagnose errors anyway – unless you had enough foresight to litter the code with ChipScope debug probes in all the right places.</p>
<p>These days, I use <a href="http://iverilog.icarus.com/">Icarus Verilog</a> for almost all of my simulation needs, except when some proprietary IP is involved for which no functional model is available outside the vendor tools. It is an open source project that provides a Verilog parser, optimizer and virtual machine, and together with a waveform viewer such as <a href="http://gtkwave.sourceforge.net/">GtkWave</a> makes for a nice light-weight testing environment. For small-ish projects, it tends to have already finished the simulation before the clunky and bug-ridden vendor tools such as Xilinx <em>isim</em> would have even completed starting up.</p>
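<p>For reference, a typical Icarus Verilog session is only a couple of commands (the file names here are hypothetical examples; the VCD dump file name is whatever your test-bench passes to <code>$dumpfile</code>):</p>

```
# Compile the design under test plus the test-bench into a vvp "executable",
# run the simulation, then inspect the waveform dump.
iverilog -o axi_tb.vvp axi_slave.v axi_tb.v
vvp axi_tb.vvp
gtkwave dump.vcd
```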
<p>Co-simulating the code to run on the ARM CPU and the FPGA design is certainly possible – maybe by using Verilator and piping data flow on the AXI buses back and forth between the domains, or by bringing out the “big guns”, i.e. system-level verification tools made by companies like Cadence. The most straightforward solution, however, is certainly to test the core in question in isolation, while just manually handling the necessary AXI communication in the test-bench.</p>
<p>Owing to the simplicity of the AXI4-Lite protocol, such functionality is not hard to implement. The “AMBA® AXI™ and ACE™ Protocol Specification” – available on the ARM website after logging in, and certainly floating around in other places as well – is quite clear and well-written. Interestingly, however, none of the templates provided by Xilinx seem to include the relevant pieces of HDL. So, without further ado, here is a Verilog task that reads a single word from the bus and compares it to the expected value:</p>
<figure class="code"> <div class="highlight"><pre><span class="k">task</span> <span class="k">automatic</span> <span class="n">enforce_axi_read</span><span class="p">;</span>
<span class="k">input</span> <span class="p">[</span><span class="no">C_S_AXI_ADDR_WIDTH</span> <span class="o">-</span> <span class="mh">1</span> <span class="o">:</span> <span class="mh">0</span><span class="p">]</span> <span class="n">addr</span><span class="p">;</span>
<span class="k">input</span> <span class="p">[</span><span class="no">C_S_AXI_DATA_WIDTH</span> <span class="o">-</span> <span class="mh">1</span> <span class="o">:</span> <span class="mh">0</span><span class="p">]</span> <span class="n">expected_data</span><span class="p">;</span>
<span class="k">begin</span>
<span class="n">s_axi_araddr</span> <span class="o">=</span> <span class="n">addr</span><span class="p">;</span>
<span class="n">s_axi_arvalid</span> <span class="o">=</span> <span class="mh">1</span><span class="p">;</span>
<span class="n">s_axi_rready</span> <span class="o">=</span> <span class="mh">1</span><span class="p">;</span>
<span class="k">wait</span><span class="p">(</span><span class="n">s_axi_arready</span><span class="p">);</span>
<span class="k">wait</span><span class="p">(</span><span class="n">s_axi_rvalid</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">s_axi_rdata</span> <span class="o">!=</span> <span class="n">expected_data</span><span class="p">)</span> <span class="k">begin</span>
<span class="nb">$display</span><span class="p">(</span><span class="s">"Error: Mismatch in AXI4 read at %x: "</span><span class="p">,</span> <span class="n">addr</span><span class="p">,</span>
<span class="s">"expected %x, received %x"</span><span class="p">,</span>
<span class="n">expected_data</span><span class="p">,</span> <span class="n">s_axi_rdata</span><span class="p">);</span>
<span class="k">end</span>
<span class="p">@(</span><span class="k">posedge</span> <span class="n">s_axi_aclk</span><span class="p">)</span> <span class="p">#</span><span class="mh">1</span><span class="p">;</span>
<span class="n">s_axi_arvalid</span> <span class="o">=</span> <span class="mh">0</span><span class="p">;</span>
<span class="n">s_axi_rready</span> <span class="o">=</span> <span class="mh">0</span><span class="p">;</span>
<span class="k">end</span>
<span class="k">endtask</span>
</pre></div><figcaption><span>Reading a word from the AXI4-Lite bus and comparing it to an expected result. </span></figcaption>
</figure>
<p>All the <code>s_axi_…</code> signals are supposed to be hooked up to the corresponding ports of the unit under test, as they would be in an auto-generated test-bench module. To use it, simply insert <code>enforce_axi_read(&lt;addr&gt;, &lt;data&gt;);</code> at the appropriate point in your test sequence.</p>
<p>In the same vein, the following task writes a data word to the given address:</p>
<figure class="code"> <div class="highlight"><pre><span class="k">task</span> <span class="k">automatic</span> <span class="n">axi_write</span><span class="p">;</span>
<span class="k">input</span> <span class="p">[</span><span class="no">C_S_AXI_ADDR_WIDTH</span> <span class="o">-</span> <span class="mh">1</span> <span class="o">:</span> <span class="mh">0</span><span class="p">]</span> <span class="n">addr</span><span class="p">;</span>
<span class="k">input</span> <span class="p">[</span><span class="no">C_S_AXI_DATA_WIDTH</span> <span class="o">-</span> <span class="mh">1</span> <span class="o">:</span> <span class="mh">0</span><span class="p">]</span> <span class="n">data</span><span class="p">;</span>
<span class="k">begin</span>
<span class="n">s_axi_wdata</span> <span class="o">=</span> <span class="n">data</span><span class="p">;</span>
<span class="n">s_axi_awaddr</span> <span class="o">=</span> <span class="n">addr</span><span class="p">;</span>
<span class="n">s_axi_awvalid</span> <span class="o">=</span> <span class="mh">1</span><span class="p">;</span>
<span class="n">s_axi_wvalid</span> <span class="o">=</span> <span class="mh">1</span><span class="p">;</span>
<span class="k">wait</span><span class="p">(</span><span class="n">s_axi_awready</span> <span class="o">&&</span> <span class="n">s_axi_wready</span><span class="p">);</span>
<span class="p">@(</span><span class="k">posedge</span> <span class="n">s_axi_aclk</span><span class="p">)</span> <span class="p">#</span><span class="mh">1</span><span class="p">;</span>
<span class="n">s_axi_awvalid</span> <span class="o">=</span> <span class="mh">0</span><span class="p">;</span>
<span class="n">s_axi_wvalid</span> <span class="o">=</span> <span class="mh">0</span><span class="p">;</span>
<span class="k">end</span>
<span class="k">endtask</span>
</pre></div><figcaption><span>Writing a word to the AXI4-Lite bus. </span></figcaption>
</figure>
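<p>Put together, a test sequence using the two tasks might look like the following sketch in the test-bench (the register offset and value are hypothetical, and the reset/clock start-up code is elided):</p>

```verilog
initial begin
    // ... reset sequence and clock generation elided ...

    // Write a value to a hypothetical register at offset 0x4,
    // then check that it reads back unchanged.
    axi_write(32'h0000_0004, 32'hdead_beef);
    enforce_axi_read(32'h0000_0004, 32'hdead_beef);

    $finish;
end
```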
<p>As a final note, be aware that these tasks are not at all intended to verify the protocol-level implementation of the AXI interface itself. A verified boilerplate solution, such as the one auto-generated by the Xilinx tools, would be used most of the time anyway. However, it might be interesting to know that ARM offers a set of <a href="http://infocenter.arm.com/help/topic/com.arm.doc.dui0534b/DUI0534B_amba_4_axi4_protocol_assertions_ug.pdf">AXI 4 Protocol Assertion</a> cores that can be inserted into the design to verify that the bus signalling conforms to the specification.</p>
<h1><a href="http://klickverbot.at/blog/2013/05/the-state-of-ldc-on-windows">The State of LDC on Windows</a> (2013-05-31)</h1>
<p class="lead">LDC is one of the three major D compilers. It uses the same frontend as DMD, the reference implementation of the language, but leverages LLVM for optimization and code generation. While it has been stable on Linux and OS X for quite some time, support for the Windows operating system family was virtually non-existent so far. There have been substantial advances recently, and this post gives an overview of the current situation.</p>
<p>Before going on to discuss the present status, though, let me quickly answer the inevitable question: Why did it take so long? It is not that the D community (or the LDC contributors in particular) failed to recognize the importance of Windows as a target platform. Rather, the lack of a working Windows port was caused by the fact that LLVM itself did not support all the required operating-system-specific features. Notably, exception handling was not implemented at all on Windows for a long time.</p>
<p>This applies to 32-bit variants of Windows (<em>Win32</em>) as well as to the newer 64-bit operating systems (<em>Win64</em>), but interestingly the reasons for this are completely different. In the latter case, the problem was just that nobody took the time to implement the (table-driven) Win64 exception handling scheme in the LLVM backend. This is not so surprising, as most of the big companies sponsoring LLVM development are not using LLVM on Windows, or in an application domain that does not require features such as native exception handling or thread-local storage support.</p>
<p>However, Kai Nacke has tackled this problem recently, along with a number of other LLVM issues blocking development of the Visual Studio-based Win64 port of LDC. A patch fixing the bulk of the bugs in the exception handling implementation is currently under review on the LLVM development mailing list, and Kai has <a href="http://forum.dlang.org/post/vscpokspiejlckivqsuq@forum.dlang.org">prepared a binary preview version of LDC</a> with all the latest patches. For more information, you can also visit the <a href="http://wiki.dlang.org/Building_and_hacking_LDC_on_Windows_using_MSVC">Building and hacking LDC on Windows using MSVC</a> page on the LDC wiki.</p>
<p>The rest of this post will discuss the situation specifically on Win32/MinGW. Here, the root problem is that Structured Exception Handling (<em>SEH</em>), the default exception handling mechanism on 32-bit Windows, is covered by a <a href="http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=5,628,016.PN.&OS=PN/5,628,016&RS=PN/5,628,016">Borland-held patent</a>. It will not expire until next year, and while Borland seems to dismiss any related concerns, the GCC and LLVM projects have decided to not include an implementation of SEH in their compiler backends for fear of legal trouble.</p>
<p>Recently, however, support for DWARF 2-style exception handling appeared in GCC/MinGW. Here, the Windows-“native” SEH is forgone for the same table-based exception handling scheme that is also used on Linux. The downside of this approach is obviously that it doesn’t integrate with SEH exceptions raised by the OS or other C libraries. But while it is theoretically possible to catch those from D, this (DMD) feature isn’t really widely used, and as such virtually all D projects should be oblivious to the exception handling mechanism used under the hood.</p>
<h2 id="status-overview">Status Overview</h2>
<p>So, what can you expect from LDC on Win32/MinGW today? First, the good parts:</p>
<ul>
<li>
<p><em>Exception handling</em> works, and all the related test cases that also pass on the various Posixen also pass on Win32/MinGW. Why this qualification? Just like GDC, LDC unfortunately doesn’t implement all the fine details of D’s exception chaining mechanism on any platform yet.</p>
</li>
<li>
<p><em>Thread-local storage (TLS)</em> support is solid. Seeing this item on the list might surprise you, as TLS is central to each and every D2 application. However, it regularly turns out to be a pain point when porting D to new platforms, as it is typically not so important for other native languages. Thus, the related parts of the toolchains are typically less well tested, and LLVM on MinGW unfortunately was no exception here. At this point, however, my fixes to TLS support have arrived in the upstream versions of both mingw-w64 and LLVM, so no custom patches are required any longer (this is also the reason why LDC requires a very recent version of both).</p>
</li>
<li>
<p>The <em>DMD, druntime and Phobos</em> test suites mostly pass, and some smaller applications I tested build and work just fine. This notably includes most functionality associated with 80-bit <code>real</code>s (aka <code>long double</code>), which is notoriously problematic as the Microsoft Visual C/C++ runtime (<em>MSVCRT</em>) does not support this type of floating point numbers at all.</p>
</li>
<li>
<p>LDC is sufficiently ABI-compatible with DMD on 32-bit Windows that virtually all of the inline assembly code in druntime and Phobos works without changes. This only covers a surprisingly small part of the total ABI though, so even if DMD emitted COFF object files, it would still be a hopeless endeavor to try and link object files produced by the two compilers together, just as it is on the other operating systems.</p>
</li>
</ul>
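<p>As a minimal smoke test of the now-working exception handling – plain D, nothing platform-specific about it – a program like the following should print the caught message when built with LDC on Win32/MinGW; if the DWARF unwind tables were emitted incorrectly, the throw would instead abort the program:</p>

```d
import std.stdio;

void main()
{
    try
    {
        throw new Exception("unwinding works");
    }
    catch (Exception e)
    {
        // Only reached if stack unwinding and the EH tables work correctly.
        writeln("caught: ", e.msg);
    }
}
```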
<p>Now, for the less pleasant points:</p>
<ul>
<li>
<p>There are still a few issues related to floating-point math, particularly with complex 80-bit numbers. Single tests in <code>std.complex</code>, <code>std.math</code>, <code>std.mathspecial</code> and <code>std.internal.math.gammafunction</code> still fail, and <code>core.stdc.fenv</code> is not implemented properly yet. It seems likely that most of these problems are again caused by functions missing from MSVCRT or its MinGW replacements (one specific example is <code>fmodl</code>, which seems to cause interesting ABI issues).</p>
</li>
<li>
<p>The <code>core.sys.windows.dll</code> tests do not build, and while this would be easy to work around, DLL creation is entirely untested at this point.</p>
</li>
<li>
<p>While MinGW theoretically supports COM, the <code>std.windows.iunknown</code> tests do not link yet because of missing symbols. There is likely an easy fix, but interfacing with COM has not been tested at all.</p>
</li>
<li>
<p>There are also still two rather disconcerting test failures in <code>core.time</code> and <code>rt.util.container</code> which have not been tracked down yet.</p>
</li>
<li>
<p>LDC currently relies on using the MinGW <code>as</code> for emitting object files, as the LLVM integrated assembler does not correctly support writing the DWARF exception handling tables yet. This is suboptimal, as it causes several issues with non-ASCII characters in symbol names and generally has a negative effect on compiler performance. It currently also causes an issue with building the <code>std.algorithm</code> unit tests in debug mode, where the humongous symbol names (in the tens of kilo(!)bytes) overflow some <code>as</code>-internal data structures.</p>
</li>
<li>
<p>And most importantly, LDC/MinGW is still virtually untested on larger real-world applications. There will certainly be a number of bugs which have not been caught by any of the test suites.</p>
</li>
</ul>
<h2 id="getting-started">Getting Started</h2>
<p>So, how to try out LDC on Windows? The easiest thing would be to just download the latest binary (preview) release. For this, first grab a <em>very recent</em> MinGW-w64 snapshot, <a href="http://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win32/Personal%20Builds/rubenvb/gcc-4.8-dw2-release/i686-w64-mingw32-gcc-dw2-4.8.0-win32_rubenvb.7z/download">such as this one</a> (<em>rubenvb</em> personal build, <em>.7z</em>, ~27 MB) and extract it to an arbitrary location. It is important that you pick one built with Dwarf 2 exception handling enabled; when in doubt, just use the above one.</p>
<p>Then, download and extract the latest <a href="http://d32gngvpvl2pi1.cloudfront.net/ldc2-0.11.0-beta3-mingw-x86.7z">LDC binary release for MinGW</a> (<em>.7z</em>, ~8.5 MB). It is a “DMD-style” package that should work from any location without any extra installation steps. Before invoking LDC, you need to make sure that the MinGW <code>bin</code> directory is on your path, though. This is easiest to achieve by starting a shell using <code>mingw32env.cmd</code> in the MinGW root directory, or of course using a MSYS shell altogether.</p>
<p>If you prefer building LDC from source yourself, a guide on <a href="http://wiki.dlang.org/Building_LDC_on_MinGW_x86">building LDC on MinGW x86</a> is available on the wiki. Any help with LDC/MinGW development would be very much appreciated!</p>
Purity in D2012-05-27T00:00:00+01:00http://klickverbot.at/blog/2012/05/purity-in-d<p class="lead">Programming language design is a controversial topic, but in light of current challenges regarding both hardware trends and maintainability, several concepts originating in the <a href="http://en.wikipedia.org/wiki/Functional_programming">functional programming</a> world are being rediscovered as universally helpful. To that end, the <a href="http://dlang.org">D programming language</a> includes its own pragmatic take on the idea of <em>functional purity</em>. This article is an introduction to D’s <code>pure</code> keyword and its interaction with other language features.</p>
<p>Purity is a powerful tool for programmer and compiler alike to aid in reasoning about source code. But before we delve into the implications and use cases of the feature, first a short definition of the actual semantics of <code>pure</code> in D. If you are already familiar with the concept as implemented in other languages, please pretend you never heard of it for the moment. There will likely be a subtle difference in D’s interpretation, the quite profound consequences of which will be covered later.</p>
<p><code>pure</code> is a function attribute, and represents a contract between functions and their callers: The implementation of a pure function <em>does not access global mutable state</em>, where »global« refers to anything besides the function parameters (which must not reference data <code>shared</code> between threads), and »access« covers all reading or writing operations. A function not marked <code>pure</code> is called <em>impure</em>.</p>
<p>In a slightly less precise way, this means that pure functions always have the same effect and/or return the same result for a given set of arguments. As a consequence, a pure function for example cannot call other impure functions, or perform any kind of I/O (in the classical sense).</p>
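<p>To make this concrete, here is a minimal (made-up) example of what the compiler does and does not accept in a <code>pure</code> function:</p>

```d
import std.stdio;

int someGlobal;

int twice(int x) pure {
    // writeln(x);        // error: pure function cannot call impure 'writeln'
    // return someGlobal; // error: pure function cannot access mutable global state
    return 2 * x;         // fine: depends only on the parameter
}

void main() {
    assert(twice(21) == 42);
}
```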
<p>However, in order to make implementing non-trivial pure functions feasible, a few things are allowed in pure code that might be illegal under a very strict definition of what comprises state (feel free to skip this if you are only interested in the »big picture«):</p>
<ul>
<li>
<p><em>Aborting the program</em>: In a systems-level language like D, there will always be ways to terminate the program. As there is really no way around this, it is explicitly allowed in the specification.</p>
</li>
<li>
<p><em>Floating point calculations</em>: On x86 processors, the behavior of floating point calculations is influenced by a number of global flags (this probably applies to other ISAs which I am less familiar with as well). Thus, if a function contains even a single, perfectly innocent x87/SSE floating point expression like <code>x + y</code> or <code>cast(int)x</code>, its result, including exceptions being thrown, can vary greatly based on global state (i.e. the processor flags).<sup class="footnote" id="fnr1"><a href="#fn1">1</a></sup> Hence, under a strict definition of purity, all floating point calculations would be disallowed. As this would be an impractical restriction, in D pure functions are allowed to read and write floating point flags (note, however, that in general D functions which change the flags are required to reset them after control flow leaves the function).</p>
</li>
</ul>
<aside>
<p>In D, non-recoverable exceptions are derived from the <code>Error</code> class. While it is still possible to catch them in non-<code>@safe</code><sup class="footnote" id="fnr2"><a href="#fn2">2</a></sup> code, any invariants normally provided by the type system are not guaranteed to hold any longer at this point.</p>
</aside>
<ul><li><p><em>Allocating garbage-collected memory</em>: If maybe not on the first look, on the second thought it should be evident that the result of an operation allocating memory (think <code>malloc</code>) fundamentally depends on global state, namely the amount of free memory available to the system. An equally valid observation, though, is that being unable to use heap-allocated memory at all is a severe restriction for many operations. But it turns out that in D, if allocating GC memory using the <code>new</code> keyword fails, it does so with a non-recoverable <code>Error</code> anyway. Thus, pure functions can use <code>new</code> without violating the guarantees the type system provides. (<em>Note: </em>Strictly speaking, even using memory from the stack would be impure, because depending on the environment, the function might end up triggering a stack overflow.)</p></li></ul>
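<p>As a small illustration of the last point, the following (hypothetical) function allocates its result on the GC heap and is nevertheless accepted as <code>pure</code>:</p>

```d
ulong[] firstSquares(uint count) pure {
    auto result = new ulong[count];  // GC allocation is permitted in pure code
    foreach (i, ref e; result)
        e = cast(ulong)(i * i);
    return result;
}

void main() {
    assert(firstSquares(4) == [0, 1, 4, 9]);
}
```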
<h2 id="what-about-referential-transparency">What About Referential Transparency?</h2>
<p>One thing is ubiquitous in the functional programming world, but conspicuously absent from the above definition: The immutability of the function parameters. This is neither an oversight, nor has it been implied – pure functions in D can alter their arguments. For example, the following snippet is perfectly valid D code:</p>
<figure class="code"> <div class="highlight"><pre><span class="kt">int</span> <span class="n">readAndIncrement</span><span class="p">(</span><span class="k">ref</span> <span class="kt">int</span> <span class="n">x</span><span class="p">)</span> <span class="k">pure</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">x</span><span class="p">++;</span>
<span class="p">}</span>
</pre></div></figure>
<p>This might be surprising to some, as purity in programming language theory typically implies referential transparency, which means that a function invocation can be replaced with its result without changing the program semantics (implying absence of side effects). However, this is not automatically the case in D. For example, this piece of code</p>
<figure class="code"> <div class="highlight"><pre><span class="kt">int</span> <span class="n">val</span> <span class="p">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">auto</span> <span class="n">result</span> <span class="p">=</span> <span class="n">readAndIncrement</span><span class="p">(</span><span class="n">val</span><span class="p">)</span> <span class="p">*</span> <span class="n">readAndIncrement</span><span class="p">(</span><span class="n">val</span><span class="p">);</span>
<span class="c1">// assert(val == 3 && result == 2);</span>
</pre></div></figure>
<p>clearly does not give the same result if <code>readAndIncrement</code> is only evaluated once instead:</p>
<figure class="code"> <div class="highlight"><pre><span class="kt">int</span> <span class="n">val</span> <span class="p">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">auto</span> <span class="n">tmp</span> <span class="p">=</span> <span class="n">readAndIncrement</span><span class="p">(</span><span class="n">val</span><span class="p">);</span>
<span class="k">auto</span> <span class="n">result</span> <span class="p">=</span> <span class="n">tmp</span> <span class="p">*</span> <span class="n">tmp</span><span class="p">;</span>
<span class="c1">// assert(val == 2 && result == 1);</span>
</pre></div></figure>
<p>As covered in the next section, this behavior is actually very desirable in an imperative language, but what to do if you actually want the stronger guarantees of the classical definition of purity, and all the nice properties it entails? Here, another aspect of the D type systems comes to the rescue: the option to transitively mark a view on data as <code>const</code> or the data to be completely <code>immutable</code><sup class="footnote" id="fnr3"><a href="#fn3">3</a></sup>. For a closer look at this, consider the following three function declarations:</p>
<figure class="code"> <div class="highlight"><pre><span class="kt">int</span> <span class="n">a</span><span class="p">(</span><span class="kt">int</span><span class="p">[]</span> <span class="n">val</span><span class="p">)</span> <span class="k">pure</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">b</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="p">[]</span> <span class="n">val</span><span class="p">)</span> <span class="k">pure</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">c</span><span class="p">(</span><span class="k">immutable</span> <span class="kt">int</span><span class="p">[]</span> <span class="n">val</span><span class="p">)</span> <span class="k">pure</span><span class="p">;</span>
</pre></div></figure>
<p>Regarding <code>a</code> with its mutable parameter, the same observation as for <code>readAndIncrement</code> applies (<code>int[]</code> is a dynamic array, i.e. a pointer/length pair referring to a slice of memory). In case of <code>b</code> and <code>c</code>, though, something nice happens: Because the functions are pure, we know that they cannot read/change any global state, and the parameters are not mutable either, so <code>b</code> and <code>c</code> are side-effect free in the usual sense of the word – calls to them are referentially transparent.</p>
<p>That being said, is there a difference between <code>b</code> and <code>c</code> at all? From a purity point of view, there is none – <code>const</code> and <code>immutable</code> impose exactly the same restrictions on what the function can do with its parameters (the latter additionally provides the guarantee that the data will indeed never change, but as no references to it can escape besides the return value due to <code>pure</code>, this is unlikely to matter in most cases).</p>
<p>However, there is a subtle but important difference affecting the <em>calling</em> code, depending on whether the actual <em>arguments</em> to a function call are merely <code>const</code>, which both mutable and immutable values are implicitly convertible to, or <code>immutable</code> (i.e. the following applies to both <code>c</code> and <code>b</code> if called with an <code>immutable</code> array).</p>
<p>For example, consider implementing a memoization or common subexpression elimination mechanism. When coming across a <code>pure</code> function with <code>immutable</code> parameters, only the identity of the arguments has to be checked in order to be able to optimize several calls down to one, e.g. by comparing the memory addresses in the case of a runtime implementation, or by a few very simple checks in an optimizing compiler. On the other hand, if an argument type contains indirections and is only <code>const</code>, somebody else could modify the data between two calls, requiring »deep« comparisons that might not be feasible for large data structures in the runtime case, or extensive data flow analysis in a compiler.</p>
<p>The same consideration applies to parallelization: If the arguments of a pure function have no or only <code>immutable</code> indirections, it is guaranteed that it is safe to parallelize, because it can cause no side effects which could lead to non-deterministic behavior, and there can be no data races in the parameters as well. However, for <code>const</code> arguments, this cannot as easily be inferred, because another piece of code with a mutable view on the arguments could end up modifying them at the same time.</p>
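<p>Phobos actually ships a runtime take on the memoization idea in <code>std.functional.memoize</code>, which caches results keyed on the argument values – safe precisely because the wrapped function is »strongly« pure. A quick sketch (the function names are of course made up):</p>

```d
import std.functional : memoize;

// Strongly pure: value parameter, no access to global mutable state.
ulong sumTo(uint n) pure {
    ulong s = 0;
    foreach (i; 1 .. n + 1)
        s += i;
    return s;
}

// Repeated calls with the same argument can be served from a cache.
alias memoize!sumTo fastSum;

void main() {
    assert(fastSum(100) == 5050);
    assert(fastSum(100) == 5050); // second call hits the cache
}
```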
<!-- Comment about explicit annotations, std.traits.ParameterTypeTuple. -->
<h2 id="indirections-in-the-return-type">Indirections in the Return Type?</h2>
<p>In the previous examples, the functions <code>a</code>, <code>b</code> and <code>c</code> differed in whether there were mutable indirections present in their arguments, but in all three cases, the return type was <code>int</code>, the archetypical example for a value type. Is there more to consider if a pure function returns a type containing references?</p>
<p>The first essential point concerns addresses, or more precisely the definition of equality applied when considering referential transparency. In functional languages, the actual memory address that some value resides at is usually of little to no importance. D being a system programming language, however, exposes this concept. Now, consider a function <code>ulong[] primes(uint count) pure</code>, which allocates an array and fills it with the first <code>count</code> prime numbers. Invoking <code>primes</code> multiple times with the same <code>count</code> will always return the same numbers, but the arrays containing the result will be allocated at different addresses. Thus, it is clear that when considering referential transparency of functions with indirections in the return value, logical equality (<code>==</code>) instead of bit-by-bit equality (<code>is</code>) is what matters.</p>
<p>The second thing important for referential transparency are mutable indirections in the return type. For example, consider the following snippet of code using the hypothetical <code>primes</code> function:</p>
<figure class="code"> <div class="highlight"><pre><span class="k">auto</span> <span class="n">p</span> <span class="p">=</span> <span class="n">primes</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>
<span class="k">auto</span> <span class="n">q</span> <span class="p">=</span> <span class="n">primes</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>
<span class="n">p</span><span class="p">[]</span> <span class="p">*=</span> <span class="mi">2</span><span class="p">;</span>
</pre></div></figure>
<p>Obviously, rewriting the second invocation of <code>primes</code> to <code>auto q = p</code> is not valid, because then, <code>q</code> would refer to the same slice of memory, and thus also contain twice the primes after the multiplication is executed. Generally speaking, the invocation of a pure function with mutable indirections in its return type cannot immediately be considered referentially transparent, but a number of calls might still be optimized as if it were, depending on how the calling code uses the return values.</p>
<h2 id="weak-purity-allows-for-stronger-guarantees">›Weak‹ Purity Allows for Stronger Guarantees</h2>
<p>At this point, it should be mentioned that the initial design of the <code>pure</code> keyword in D featured a much stricter set of rules, and while the language specification only ever had a single notion of purity (as defined in the introduction), during discussion of the current more permissive design two terms were coined: <em>weakly pure</em>, referring to functions like <code>readAndIncrement</code> and <code>a</code> from the above examples which have mutable parameters, and <em>strongly pure</em> for side-effect free functions like <code>b</code> and <code>c</code>. Note, however, that there is no exact definition for these terms and their use frequently is the source of confusion in online discussions – to the point where Don Clugston, who introduced the names in his proposal for the improved design, has already asked for them not to be used any longer.</p>
<p>Still, the terms remain in use today, and the fact that this arbitrary distinction refuses to go away corroborates the observation that the amount of guarantees <code>pure</code> provides varies greatly depending on the parameter/return types. And, if maybe only for the reason that it is unfamiliar – the actual rules are very simple –, the implications of the current design are sometimes poorly understood. So, what is the motivation behind allowing pure functions to modify their arguments in the first place?</p>
<p>The real power behind the D purity design is that relaxing the rules actually allows <em>more functions to be »strongly« pure</em>. To illustrate this, allow me to quote a recent <a href="http://altdevblogaday.com">#AltDevBlogADay</a> article by John Carmack (of <em>id Software</em> fame) titled »Functional Programming in C++«, a refreshingly pragmatic look at the benefits of applying some functional principles to C++ code:</p>
<blockquote>
<p>Programming with pure functions will involve more copying of data, and in some cases this clearly makes it the incorrect implementation strategy due to performance considerations. As an extreme example, you can write a pure <code>DrawTriangle()</code> function that takes a framebuffer as a parameter and returns a completely new framebuffer with the triangle drawn into it as a result. Don’t do that. — <a href="http://www.altdevblogaday.com/2012/04/26/functional-programming-in-c/">altdevblogaday.com/…/functional-programming-in-c</a></p>
</blockquote>
<p>There is nothing wrong with this statement, copying the frame buffer every time you draw a triangle is certainly not a good idea. But it turns out that in D, you can actually implement a <code>pure</code> triangle drawing function without committing performance suicide! Its signature might look something like this:<sup class="footnote" id="fnr4"><a href="#fn4">4</a></sup></p>
<figure class="code"> <div class="highlight"><pre><span class="k">alias</span> <span class="kt">ubyte</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="n">Color</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">Vertex</span> <span class="p">{</span> <span class="kt">float</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="n">position</span><span class="p">;</span> <span class="cm">/* … */</span> <span class="p">}</span>
<span class="k">alias</span> <span class="n">Vertex</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="n">Triangle</span><span class="p">;</span>
<span class="kt">void</span> <span class="n">drawTriangle</span><span class="p">(</span><span class="n">Color</span><span class="p">[]</span> <span class="n">framebuffer</span><span class="p">,</span> <span class="k">const</span> <span class="k">ref</span> <span class="n">Triangle</span> <span class="n">tri</span><span class="p">)</span> <span class="k">pure</span><span class="p">;</span>
</pre></div></figure>
<p>This is nice in and for itself: as remarked in the above quote, <code>drawTriangle</code> cannot realistically be referentially transparent since it needs to write to the frame buffer, but <code>pure</code> still guarantees that it does not mess around with any hidden/global state. However, there is more: Being pure, the function can now be called from other pure functions. Continuing the toy example, if allocating a new buffer every frame was an option, this could be a function to render a whole scene consisting of triangles:</p>
<figure class="code"> <div class="highlight"><pre><span class="n">Color</span><span class="p">[]</span> <span class="n">renderScene</span><span class="p">(</span>
<span class="k">const</span> <span class="n">Triangle</span><span class="p">[]</span> <span class="n">triangles</span><span class="p">,</span>
<span class="kt">ushort</span> <span class="n">width</span> <span class="p">=</span> <span class="mi">640</span><span class="p">,</span>
<span class="kt">ushort</span> <span class="n">height</span> <span class="p">=</span> <span class="mi">480</span>
<span class="p">)</span> <span class="k">pure</span> <span class="p">{</span>
<span class="k">auto</span> <span class="n">image</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Color</span><span class="p">[</span><span class="n">width</span> <span class="p">*</span> <span class="n">height</span><span class="p">];</span>
<span class="k">foreach</span> <span class="p">(</span><span class="k">ref</span> <span class="n">triangle</span><span class="p">;</span> <span class="n">triangles</span><span class="p">)</span> <span class="p">{</span>
<span class="n">drawTriangle</span><span class="p">(</span><span class="n">image</span><span class="p">,</span> <span class="n">triangle</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">image</span><span class="p">;</span>
<span class="p">}</span>
</pre></div></figure>
<p>Note how the arguments of <code>renderScene</code> lack any mutable indirections – while it internally calls the argument-mutating <code>drawTriangle</code>, <code>renderScene</code> as a whole is referentially transparent!</p>
<p>Now, granted, this example might be a bit contrived, but with D unwilling to give up the bare-metal performance of imperative code, similar situations quite frequently occur in real-life code (e.g. when using any kind of mutable container in the implementation of a pure function). This is also backed by experience with the aforementioned first iteration of the purity design – relaxing the purity rules had the, at first sight, slightly paradoxical effect of enabling the <em>same strong guarantees</em> as before to be provided for a greatly <em>increased amount of code</em>.</p>
<p>A related observation is that most modern style guides discourage use of global state anyway, and thus, it should be possible to mark most D functions not dealing with I/O as pure. This is indeed true – so why not make <code>pure</code> the default and require functions to be explicitly marked as, say, <code>impure</code> instead? Regarding D version 2, the reason why this has not been done is simply that purity in its current form was only added at a relatively late point in the evolution of the language, where the impact of such a breaking change was simply considered to be too high. Nevertheless, this is certainly a promising direction to explore for future languages and a (hypothetical) next major release of D.</p>
<h2 id="templates-and-purity">Templates and Purity</h2>
<p>Up to this point, the focus was on the design of <code>pure</code> more or less in isolation. In the following sections, the main topic will be its interaction with other language features, with the first one being templates, or more specifically function templates.</p>
<p>Once instantiated with their type parameters, function templates are just normal functions, so purity should just work as previously described for them as well. This is indeed the case, but there is additional complexity because whether a function template can be pure or not might actually depend on the types it is instantiated with.</p>
<p>For an example of this, suppose you want to write a function <code>array</code> which accepts a range<sup class="footnote" id="fnr5"><a href="#fn5">5</a></sup> and returns an array containing all of its elements (this function already exists in <code>std.array</code> with a much better implementation). A first take on the problem could look somewhat like this:</p>
<figure class="code"> <div class="highlight"><pre><span class="k">auto</span> <span class="n">array</span><span class="p">(</span><span class="n">R</span><span class="p">)(</span><span class="n">R</span> <span class="n">r</span><span class="p">)</span> <span class="k">if</span> <span class="p">(</span><span class="n">isInputRange</span><span class="p">!</span><span class="n">R</span><span class="p">)</span> <span class="p">{</span>
<span class="n">ElementType</span><span class="p">!</span><span class="n">R</span><span class="p">[]</span> <span class="n">result</span><span class="p">;</span>
<span class="k">while</span> <span class="p">(!</span><span class="n">r</span><span class="p">.</span><span class="n">empty</span><span class="p">)</span> <span class="p">{</span>
<span class="n">result</span> <span class="p">~=</span> <span class="n">r</span><span class="p">.</span><span class="n">front</span><span class="p">;</span>
<span class="n">r</span><span class="p">.</span><span class="n">popFront</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">result</span><span class="p">;</span>
<span class="p">}</span>
</pre></div><figcaption><span>A simple, inefficient reimplementation of <code>std.array.array</code>, which converts a range of elements into a built-in array. Could <code>pure</code> be added to this function? </span></figcaption>
</figure>
<p>It is not hard to guess what this is doing – one by one, the front element of the range is popped off and appended to the result array until there are no more elements left. But the question is now: Can this function be made <code>pure</code>? If <code>R</code> is something like the result of a <code>map</code> or <code>filter</code> operation on an array, there is no reason why it should not be callable from pure code. However, if <code>R</code> for example encapsulates a line being read from standard input, there is no way <code>r.empty</code>, <code>r.front</code> and <code>r.popFront()</code> can all be <code>pure</code>. Thus, if <code>array</code> were marked <code>pure</code>, it could not operate on such ranges anymore, even if it would otherwise be perfectly able to. So, what to do?</p>
<p>One way of approaching this problem would be to introduce syntax sugar for only conditionally applying attributes to a declaration based on some predicate (which would here depend on <code>R</code>). However, this was rejected due to the complexity and repetition it would introduce to code that really should be easy to write. The solution which was finally implemented is quite simple: Since D takes a »white-box« approach to templates anyway, meaning that in order to instantiate a template its source must be available, purity is automatically inferred by the compiler for them (along with a few similar attributes like <code>nothrow</code>).</p>
<p>For the above example, this means that <code>array</code> will be callable from pure functions if the concrete range type allows it, and simply be impure otherwise. Also note that explicitly specifying <code>pure</code> for template functions is still possible, and can be beneficial for documentation purposes if purity does not depend on the template arguments.</p>
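<p>To see the inference in action, consider a self-contained toy range (all names hypothetical) whose primitives happen to be pure:</p>

```d
// A pure input range yielding n, n - 1, …, 1.
struct Counter {
    int n;
    @property bool empty() const pure { return n == 0; }
    @property int front() const pure { return n; }
    void popFront() pure { --n; }
}

// A stripped-down 'array': purity is inferred per instantiation.
auto collect(R)(R r) {
    typeof(r.front)[] result;
    while (!r.empty) {
        result ~= r.front;
        r.popFront();
    }
    return result;
}

int[] countdown(int n) pure {
    // Compiles because collect!Counter is inferred pure; with a range
    // that e.g. read from standard input, this call would be rejected.
    return collect(Counter(n));
}

void main() {
    assert(countdown(3) == [3, 2, 1]);
}
```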
<h2 id="pure-member-functions">Pure Member Functions</h2>
<p>Unsurprisingly, struct and class member functions can be <code>pure</code> as well, and exactly the same rules as for free functions apply to them – with a single addition, or rather clarification: The implicit <code>this</code> parameter is also considered a function parameter for purity semantics, which is a fancy way of saying that pure functions may access and modify member variables.</p>
<figure class="code"> <div class="highlight"><pre><span class="k">class</span> <span class="n">Foo</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">getBar</span><span class="p">()</span> <span class="k">const</span> <span class="k">pure</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">setBar</span><span class="p">(</span><span class="kt">int</span> <span class="n">bar</span><span class="p">)</span> <span class="k">pure</span> <span class="p">{</span>
<span class="k">this</span><span class="p">.</span><span class="n">bar</span> <span class="p">=</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">private</span> <span class="kt">int</span> <span class="n">bar</span><span class="p">;</span>
<span class="p">}</span>
</pre></div><figcaption><span>Pure functions are allowed to access member variables (note: typically properties would be used in place of getters/setters in D). </span></figcaption>
</figure>
<p>Also note that marking a member function <code>const</code> or <code>immutable</code> is semantically equivalent to applying the attribute to its implicit <code>this</code> parameter; i.e. the above considerations regarding mutability also apply unchanged.</p>
<p>As far as class inheritance is concerned, purity behaves just as one would expect: Generally, a member function in a subclass may require less assumptions while possibly providing more guarantees than its base class equivalent (see e.g. return type covariance). Thus, a pure function might override an impure function, but not the other way round. Actually, for convenience a function overriding a pure base class method is implicitly marked <code>pure</code> (similar to <code>virtual</code> in C++); Walter Bright recently wrote <a href="http://www.drdobbs.com/blogs/cpp/232601305">a blog post</a> about this.</p>
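<p>A short (made-up) example of the inheritance rule:</p>

```d
class Shape {
    double area() const pure { return 0.0; }
}

class Square : Shape {
    double side;
    this(double s) pure { side = s; }
    // No explicit 'pure' needed – the override inherits it from Shape.area.
    // An implementation touching global mutable state would be rejected.
    override double area() const { return side * side; }
}

void main() {
    Shape s = new Square(3.0);
    assert(s.area() == 9.0);
}
```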
<h2 id="pure-and-immutable--again"><code>pure</code> and <code>immutable</code> – again?</h2>
<p>The effects of <code>const</code> and <code>immutable</code> on referential transparency have already been discussed at length. However, the guarantees of <code>pure</code> in some cases also allow additional conclusions to be drawn. A prominent case of this, because it is integrated with the type system, is that the return value of pure functions can in some cases be safely cast to <code>immutable</code>. For example, consider the function <code>ulong[] primes(uint n) pure</code> from above. At first, it is not obvious why the following code should compile:</p>
<figure class="code"> <div class="highlight"><pre><span class="k">immutable</span> <span class="kt">ulong</span><span class="p">[]</span> <span class="n">p</span> <span class="p">=</span> <span class="n">primes</span><span class="p">(</span><span class="mi">5</span><span class="p">);</span>
</pre></div></figure>
<p>After all, <code>immutable</code> is a guarantee that there are no mutable references to the data in question at all, but <code>primes</code> clearly returns an array of mutable values. Still, the above code compiles fine, so what is going on here? The reason why it is indeed safe to assume that no other mutable references exist to the return value of <code>primes</code> is of course the fact that it is pure: It does not take any arguments with mutable indirections, nor can it read any global mutable state, so even though the slice returned refers to mutable data, the caller can be sure that nobody else could potentially modify the data.</p>
<p>This seems to be a fairly minor detail, but it turns out to be surprisingly useful in practice, as it allows functions to be seamlessly used in a »functional-style« immutable data context, while at the same time not requiring unnecessary copies in more »traditional« pieces of code, where data might need to be mutated in-place for performance reasons.</p>
<h2 id="fine-but-where-is-the-escape-hatch">Fine, but where is the Escape Hatch?</h2>
<p>It lies in the very nature of purity that it is viral, in the sense that when writing a pure function, all code its implementation depends on must be pure as well. D’s purity rules make this compositional aspect of purity very natural, but still, sometimes the need arises to call a function that is nominally impure from <code>pure</code> code.</p>
<p>One such situation is dealing with legacy code, for example calling a function from an external C library which meets all the criteria to be pure, but has not been marked so in the header files. Such situations are handled the same way as all other cases where the type system cannot prove a statement about code: by using a <code>cast</code>. More specifically, by getting a pointer to the function, adding the <code>pure</code> attribute by casting, and then calling it as usual (like any other operation which potentially subverts the type system, this is forbidden in <code>@safe</code> code). If a piece of code has to deal with lots of such »dirty« calls, introducing a short <code>assumePure</code> template which nicely encapsulates the casts might be worthwhile.</p>
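<p>As a sketch of what such a helper might look like (the <code>rand</code> example is merely illustrative; attribute inference makes the template itself callable from <code>pure</code> code):</p>
<figure class="code"><pre><code>import std.traits;

// Illustrative helper: wrap an impure function pointer (or delegate) so
// that it can be called from pure code. The cast subverts the type
// system, so this must not be used in @safe code.
auto assumePure(T)(T t)
    if (isFunctionPointer!T || isDelegate!T)
{
    enum attrs = functionAttributes!T | FunctionAttribute.pure_;
    return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) t;
}

// Example: a nominally impure C library function…
extern (C) int rand();

// …called from a pure function via the casted pointer.
int diceRoll() pure
{
    return assumePure(&amp;rand)() % 6 + 1;
}
</code></pre></figure>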
<p>And then there is this other thing where purity might momentarily be a hindrance: inserting (impure) debug code into functions, for example to log some values or to take simple call statistics by bumping a global variable every time a function is invoked. Inserting such an impure statement into the innermost of a chain of pure functions would be a major annoyance, and while this style of debugging might be scoffed at by language purists, it is sometimes quite useful in practice.</p>
<p>Initially, D did not include any special provision for this use case, but a way to »temporarily disable purity« for debugging purposes was much requested. As a result, a special case was eventually added to the rules, allowing impure code in pure functions if it is inside a <code>debug</code> conditional. This solution is easy to use, and while not perfectly clean, it is still acceptable from an aesthetic point of view, since such code has to be explicitly enabled via a command line switch (it is <em>not</em> included in normal non-<code>release</code> builds).</p>
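<p>A minimal sketch of this escape hatch in action (names made up for illustration):</p>
<figure class="code"><pre><code>import std.stdio;

int callCount; // global mutable state, normally off-limits to pure code

int square(int x) pure
{
    // Impure statements are permitted inside a debug conditional; this
    // code is only compiled in when -debug is passed to the compiler.
    debug
    {
        ++callCount;
        writeln("square(", x, ")");
    }
    return x * x;
}
</code></pre></figure>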
<h2 id="conclusion">Conclusion</h2>
<p>To reiterate the statement from the beginning, the importance of the concept of purity lies within the fact that it allows the type system to assert that a particular function call will not depend on or modify hidden state. We have seen that the <code>pure</code> keyword in D imposes fewer restrictions than in many other languages, yet the same guarantees can still be given thanks to the interesting properties of transitive const-ness and immutability; this enables very natural interaction with other language features and, perhaps most importantly, with imperative-style code.</p>
<p>Where to go for further information? The actual specification for <code>pure</code> is not very long, see the <a href="http://dlang.org/function.html">Functions</a> chapter of the language reference at <a href="http://dlang.org">dlang.org</a>. For background information about the evolution of the current design, the <a href="http://forum.dlang.org/thread/i7bp8o$6po$1@digitalmars.com">discussion started by Don Clugston</a> which led to the last big change is certainly an interesting read – the <a href="http://forum.dlang.org/group/digitalmars.D">D programming language forums</a> might also be a good place to ask specific questions about design and implementation of the concepts described here.</p>
<footer>
<p>Like what you read? <a href="http://twitter.com/?status=@klickverbot:">Let me know</a> what you think, <a href="http://twitter.com/?status=Just%20read%20»Purity%20in%20D«%20by%20@klickverbot:%20http://klickverbot.at/blog/2012/05/purity-in-d">share the article</a> on Twitter or join the discussions on <a href="http://news.ycombinator.com/item?id=4032248">Hacker News</a> and <a href="http://www.reddit.com/r/programming/comments/u84fc/purity_in_d/">Reddit</a>. Also, there is more on <a href="/blog/tags/D/" title="View all posts tagged with »D«" rel="tag">D</a>.</p>
</footer>
<p class="footnote" id="fn1" style="margin-top: 2.88em"><a href="#fnr1"><sup>1</sup></a>The consequences of this can be a lot more serious and confusing than one might think: Historically, several printer drivers on Windows modified the FPU flags when issuing a print job without changing them back afterwards. This caused quite a few programs to crash after a document was printed – the perfect case of a hard-to-debug crash occurring only on customer machines…</p>
<p class="footnote" id="fn2"><a href="#fnr2"><sup>2</sup></a>D code can be restricted to a memory safe language subset, sometimes referred to as <em>SafeD</em>. The feature can be activated on a per-function basis by applying one of three attributes: Code marked as <code>@safe</code> is guaranteed to be memory safe, and thus e.g. cannot do pointer arithmetic or use C-style memory management. <code>@system</code> code is the opposite – here the full language, including inline assembly and unsafe pointer casts is allowed. Finally, <code>@trusted</code> acts a bridge between both worlds, it contains hand-vetted interfaces to unsafe code. A typical example for the latter would be a type-safe D wrapper around a C <code>void*</code>-style API.</p>
<p class="footnote" id="fn3"><a href="#fnr3"><sup>3</sup></a>In D, as notably opposed to C++, <code>const</code> and <code>immutable</code> are <em>transitive</em>. In case of <code>const</code>, this means that everything reachable through a <code>const</code> reference automatically becomes <code>const</code> as well. For example, given <code>struct Foo { int bar; int* baz; }; void fun(const Foo* foo);</code>, in C++ <code>fun</code> is not allowed to modify <code>foo</code> itself, e.g. to set <code>foo->bar</code> to a different value, but can legally modify the value <code>foo->baz</code> points to – this is also called <em>shallow const</em>. In contrast, D features <em>deep const</em>, which means that in <code>fun</code>, <code>foo.baz</code> automatically becomes a <code>const</code> pointer to a <code>const</code> <code>int</code>, disallowing modifications to <code>*foo.baz</code> as well. The same rules apply to <code>immutable</code>, except that it additionally guarantees that no mutable view on the data exists at all, i.e. that not only <code>fun</code> does not modify its parameter, but nobody else ever does so (you could imagine <code>immutable</code> values to be stored in some kind of ROM). <code>immutable</code> implies <code>const</code>.</p>
<p class="footnote" id="fn4"><a href="#fnr4"><sup>4</sup></a>This example was picked for its illustrative qualities, but admittedly would probably only work like this for a simple software rasterizer. Besides the question of whether purity is much of a benefit here, if an actual graphics API was used to implement it, extra thought would have to be put into how to handle the GPU state in a pure manner.</p>
<p class="footnote" id="fn5"><a href="#fnr5"><sup>5</sup></a> Just as C++ iterators are a generalization of pointers, D ranges generalize the notion of an array or a slice of data. In its most basic form, a range offers three primitives, <code>empty</code>, <code>front</code> and <code>popFront</code>. This interface is completely oblivious to how the underlying data is stored – it could come from a chunk of memory as well as from a network transport or the standard input –, and provides an easy to use, yet powerful abstraction for algorithms to work on.</p>
Thrift now officially supports D!2012-03-27T00:00:00+01:00http://klickverbot.at/blog/2012/03/thrift-now-officially-supports-d<p class="lead"><a href="http://thrift.apache.org">Thrift</a> is a cross-language serialization and RPC framework, originally developed for internal use at Facebook, and now an <a href="http://apache.org">Apache Software Foundation</a> project. I started working on support for the <a href="http://dlang.org">D programming language</a> during <a href="http://www.klickverbot.at/code/gsoc/thrift/">Google Summer of Code 2011</a>, and at the end of last week, the implementation was finally incorporated into the main project.</p>
<p>First, let me thank Jake Farrell and everybody else on the Thrift team who was involved in <a href="https://issues.apache.org/jira/browse/THRIFT-1500">THRIFT-1500</a>; reviewing a ~719 kB patch certainly isn’t an easy thing to do. But now that the work is in, what can you (as a Thrift user) expect from the implementation?</p>
<p>Feature-wise, the library should roughly be up to par with the other major implementations (i.e. C++ and Java):</p>
<ul>
<li><p><em>Protocols:</em> Binary, Compact and JSON. The Dense protocol has not been implemented yet – it is only supported by the C++ implementation and I am not sure about its relevance nowadays (but if you are at a certain well-known company and it turns out that you still need the feature for new projects, let me know; adding support for it should not be hard).</p></li>
<li><p><em>Transports:</em> Socket, SSL, HTTP and log file reader/writer implementations (plus your familiar helpers, i.e. buffered/framed/memory-buffer/piped/zlib...)</p></li>
<li><p><em>Servers:</em> several single- and multithreaded variants (including a libevent-based non-blocking implementation)</p></li>
<li><p><em>Clients:</em> Both a synchronous and an asynchronous (future-based interface with one or more libevent-backed worker threads) implementation are provided. Additionally, several pooling implementations for redundancy as well as aggregation use cases are available.</p></li>
</ul>
<p>The implementation makes heavy use of D’s metaprogramming capabilities and is also able to work without code generated off-line from <code>.thrift</code> files, if so desired. There are also a few experimental gimmicks, such as the ability to generate Thrift IDL files from existing D types at compile time. Soon to come:</p>
<ul>
<li><p><em>Unix domain sockets: </em> Currently, the D implementation only supports IPv4 and IPv6 TCP sockets, because that is what the D standard library does, but starting with the next release, it will also support Unix domain sockets (if really needed, the lack of support in <code>std.socket</code> could be worked around without much effort, though).</p></li>
<li><p><code>@safe</code><em>-ty annotations: </em> The D language features built-in memory safety annotations. The majority of the methods in the D Thrift library should be memory safe (except for e.g. <code>TTransport.borrow</code>), so marking them as such will allow Thrift to be used in D programs where safety is enforced, without requiring the user to mark the Thrift calls as <code>@trusted</code>.</p></li>
</ul>
<p>So, how to get started? As said above, the source code has been merged from my <a href="https://github.com/dnadlinger/thrift">personal GitHub repo</a> to the <code>trunk</code> of the <a href="http://thrift.apache.org/developers/">main ASF repo</a>, and as soon as the currently ongoing rework of the official Thrift site is completed, the <a href="https://github.com/dnadlinger/thrift/wiki/Getting-Started-with-Thrift-and-D">Getting Started with Thrift and D</a> and <a href="https://github.com/dnadlinger/thrift/wiki/Building-Thrift-D-on-Windows">Building Thrift/D on Windows</a> pages will follow along. A recent build of the <a href="http://www.klickverbot.at/code/gsoc/thrift/docs/">API docs</a> is currently available here on my website. If you find any bugs, be sure to file them at the <a href="https://issues.apache.org/jira/browse/THRIFT">Thrift JIRA</a>.</p>
getaddrinfo cross-platform edge case behavior2012-01-31T00:00:00+00:00http://klickverbot.at/blog/2012/01/getaddrinfo-edge-case-behavior-on-windows-linux-and-osx<p>An often-needed piece of functionality in network programming is to resolve human-readable host or port names to their numerical equivalent, for example in order to pass the latter to operating system socket APIs. The <code>getaddrinfo</code> function fills this role on POSIX and Windows. Apart from some flags, it accepts two string parameters for host and service (port) names and returns a list of corresponding IP addresses and port numbers, superseding the older <code>gethostbyname</code> and <code>getservbyname</code> functions.</p>
<p>Either of its string parameters is allowed to be <code>null</code>, representing the local host/all interfaces (depending on whether <code>AI_PASSIVE</code> is specified) and an automatically assigned port, respectively. Both parameters being <code>null</code> at the same time, however, is disallowed by the specification, and leads to an <code>EAI_NONAME</code> error on POSIX or <code>WSAHOST_NOT_FOUND</code> on Windows. What happens if the strings are empty (<code>""</code>) instead of <code>null</code> is left open by RFC 2553, and not really mentioned in the operating system API documentation either.</p>
<p>It turns out that there are quite a few differences between the various operating systems here, which is bound to cause issues for <a href="http://winehq.org">Wine</a> (an implementation of the Windows API on POSIX/X systems). To get a clear understanding of how the different cases are handled, I put together a little <a href="http://dlang.org">D</a> program which tests a few combinations of host name, port, and flag parameters (see end of post). The snippet could be written in C just the same, as <code>getAddressInfo</code> directly maps to <code>getaddrinfo</code>; I chose D simply to avoid platform dependencies and an unduly large amount of boilerplate code.</p>
<p>The results are summarized in the following table, where »loopback« means that the IP addresses returned were <code>127.0.0.1</code> and <code>::1</code>, »catchall« refers to <code>0.0.0.0</code> and <code>::</code>, »public« means that the actual IP addresses of all available network interfaces were returned, and <code>NONAME</code> refers to a lookup error. »hostname« means that the actual fully qualified name of the host that ran the test was used (note that the host part of the FQDN alone usually does <em>not</em> resolve on OS X).</p>
<figure>
<table>
<thead>
<tr>
<th>Host</th>
<th>Port</th>
<th>Flags</th>
<th>Windows</th>
<th>Linux</th>
<th>OS X</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><code>null</code></td>
<td><code>null</code></td>
<td>-</td>
<td><code>NONAME</code></td>
<td><code>NONAME</code></td>
<td><code>NONAME</code></td>
</tr>
<tr class="odd">
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td><code>NONAME</code></td>
<td><code>NONAME</code></td>
<td><code>NONAME</code></td>
</tr>
<tr>
<td> </td>
<td><code>""</code></td>
<td>-</td>
<td>loopback</td>
<td>loopback</td>
<td><code>NONAME</code></td>
</tr>
<tr>
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>catchall</td>
<td>catchall</td>
<td><code>NONAME</code></td>
</tr>
<tr class="odd">
<td> </td>
<td><code>"0"</code></td>
<td>-</td>
<td>loopback</td>
<td>loopback</td>
<td>loopback</td>
</tr>
<tr class="odd">
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>catchall</td>
<td>catchall</td>
<td>catchall</td>
</tr>
<tr>
<td> </td>
<td><code>"80"</code></td>
<td>-</td>
<td>loopback</td>
<td>loopback</td>
<td>loopback</td>
</tr>
<tr>
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>catchall</td>
<td>catchall</td>
<td>catchall</td>
</tr>
<tr class="odd">
<td><code>""</code></td>
<td><code>null</code></td>
<td>-</td>
<td>public</td>
<td><code>NONAME</code></td>
<td><code>NONAME</code></td>
</tr>
<tr class="odd">
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>public</td>
<td><code>NONAME</code></td>
<td><code>NONAME</code></td>
</tr>
<tr>
<td> </td>
<td><code>""</code></td>
<td>-</td>
<td>public</td>
<td><code>NONAME</code></td>
<td><code>NONAME</code></td>
</tr>
<tr>
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>public</td>
<td><code>NONAME</code></td>
<td><code>NONAME</code></td>
</tr>
<tr class="odd">
<td> </td>
<td><code>"0"</code></td>
<td>-</td>
<td>public</td>
<td><code>NONAME</code></td>
<td>loopback</td>
</tr>
<tr class="odd">
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>public</td>
<td><code>NONAME</code></td>
<td>catchall</td>
</tr>
<tr>
<td> </td>
<td><code>"80"</code></td>
<td>-</td>
<td>public</td>
<td><code>NONAME</code></td>
<td>loopback</td>
</tr>
<tr>
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>public</td>
<td><code>NONAME</code></td>
<td>catchall</td>
</tr>
<tr class="odd">
<td><code>"localhost"</code></td>
<td><code>null</code></td>
<td>-</td>
<td>loopback</td>
<td>loopback (v4)</td>
<td>loopback</td>
</tr>
<tr class="odd">
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>loopback</td>
<td>loopback (v4)</td>
<td>loopback</td>
</tr>
<tr>
<td> </td>
<td><code>""</code></td>
<td>-</td>
<td>loopback</td>
<td>loopback (v4)</td>
<td>loopback</td>
</tr>
<tr>
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>loopback</td>
<td>loopback (v4)</td>
<td>loopback</td>
</tr>
<tr class="odd">
<td> </td>
<td><code>"0"</code></td>
<td>-</td>
<td>loopback</td>
<td>loopback (v4)</td>
<td>loopback</td>
</tr>
<tr class="odd">
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>loopback</td>
<td>loopback (v4)</td>
<td>loopback</td>
</tr>
<tr>
<td> </td>
<td><code>"80"</code></td>
<td>-</td>
<td>loopback</td>
<td>loopback (v4)</td>
<td>loopback</td>
</tr>
<tr>
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>loopback</td>
<td>loopback (v4)</td>
<td>loopback</td>
</tr>
<tr class="odd">
<td>hostname</td>
<td><code>null</code></td>
<td>-</td>
<td>public</td>
<td>loopback (v4)</td>
<td>public</td>
</tr>
<tr class="odd">
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>public</td>
<td>loopback (v4)</td>
<td>public</td>
</tr>
<tr>
<td> </td>
<td><code>""</code></td>
<td>-</td>
<td>public</td>
<td>loopback (v4)</td>
<td>public</td>
</tr>
<tr>
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>public</td>
<td>loopback (v4)</td>
<td>public</td>
</tr>
<tr class="odd">
<td> </td>
<td><code>"0"</code></td>
<td>-</td>
<td>public</td>
<td>loopback (v4)</td>
<td>public</td>
</tr>
<tr class="odd">
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>public</td>
<td>loopback (v4)</td>
<td>public</td>
</tr>
<tr>
<td> </td>
<td><code>"80"</code></td>
<td>-</td>
<td>public</td>
<td>loopback (v4)</td>
<td>public</td>
</tr>
<tr>
<td colspan="2"> </td>
<td><code>AI_PASSIVE</code></td>
<td>public</td>
<td>loopback (v4)</td>
<td>public</td>
</tr>
</tbody>
</table>
<figcaption><code>getaddrinfo()</code> behavior on Windows Server 2008 R2, Arch Linux (Kernel 3.1.4, glibc 2.14.1), and OS X 10.7.2 (Lion).</figcaption>
</figure>
<p>What caused me to investigate the issue in the first place is the behavior when given an empty, non-null host string: Windows returns the public addresses of the present interfaces, OS X resolves them to the loopback/catchall addresses, but only if a port is given, and Linux doesn’t resolve them at all! Windows is generally the most permissive, returning an error only for the explicitly disallowed combination, which is relied on by some applications (e.g. the game <em>League of Legends</em>).</p>
<p>There were also some less significant differences in behavior which are mostly not listed in the table. First, in both of the Linux VMs I tried (an up-to-date Arch box and Ubuntu Oneiric), only the IPv4 address of the loopback interface was returned. Second, as no address family, socket type or protocol hints were passed to <code>getaddrinfo()</code> in the test, each address was returned twice on OS X, once with <code>SOCK_STREAM</code>/<code>IPPROTO_TCP</code> and once with <code>SOCK_DGRAM</code>/<code>IPPROTO_UDP</code> set. Linux returned three copies of each address, for <code>STREAM</code>, <code>DGRAM</code> and <code>RAW</code>, with the according protocol types set, whereas Windows only returned a single copy with protocol type <code>IPPROTO_IP</code> and socket type set to 0.</p>
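<p>As an aside, passing hints restricts the lookup to a single entry per address and thus avoids these per-platform duplicates; a minimal sketch using the same <code>std.socket</code> API as the test program at the end of the post:</p>
<figure class="code"><pre><code>import std.socket, std.stdio;

void main()
{
    // Restricting the lookup to TCP stream sockets yields one entry per
    // address instead of per-platform STREAM/DGRAM/RAW duplicates.
    foreach (r; getAddressInfo("localhost", "80",
        SocketType.STREAM, ProtocolType.TCP))
    {
        writeln(r.address, " (", r.protocol, ")");
    }
}
</code></pre></figure>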
<p>In any case, as a result I have prepared a patch for Wine to emulate at least the succeeding/failing behavior of the Winsock incarnation of <code>getaddrinfo</code> on Linux and OS X, which should solve the bigger part of the related problems. There ideally shouldn’t be any Windows software relying on details beyond that (such as the actual number/layout of addresses returned), but who knows…</p>
<figure class="code"> <div class="highlight"><pre><span class="k">import</span> <span class="n">std</span><span class="p">.</span><span class="n">algorithm</span><span class="p">,</span> <span class="n">std</span><span class="p">.</span><span class="n">conv</span><span class="p">,</span> <span class="n">std</span><span class="p">.</span><span class="n">range</span><span class="p">,</span> <span class="n">std</span><span class="p">.</span><span class="n">socket</span><span class="p">,</span> <span class="n">std</span><span class="p">.</span><span class="n">stdio</span><span class="p">;</span>
<span class="k">alias</span> <span class="n">AIF</span> <span class="p">=</span> <span class="n">AddressInfoFlags</span><span class="p">;</span>
<span class="kt">void</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">foreach</span> <span class="p">(</span><span class="n">host</span><span class="p">;</span> <span class="p">[</span><span class="kc">null</span><span class="p">,</span> <span class="s">""</span><span class="p">,</span> <span class="s">"localhost"</span><span class="p">,</span> <span class="n">Socket</span><span class="p">.</span><span class="n">hostName</span><span class="p">()])</span>
<span class="k">foreach</span> <span class="p">(</span><span class="n">port</span><span class="p">;</span> <span class="p">[</span><span class="kc">null</span><span class="p">,</span> <span class="s">""</span><span class="p">,</span> <span class="s">"0"</span><span class="p">,</span> <span class="s">"80"</span><span class="p">])</span>
<span class="k">foreach</span> <span class="p">(</span><span class="n">flags</span><span class="p">;</span> <span class="p">[</span><span class="k">cast</span><span class="p">(</span><span class="n">AIF</span><span class="p">)</span><span class="mi">0</span><span class="p">,</span> <span class="n">AIF</span><span class="p">.</span><span class="n">PASSIVE</span><span class="p">])</span> <span class="p">{</span>
<span class="n">write</span><span class="p">(</span>
<span class="n">host</span> <span class="p">?</span> <span class="s">"'"</span> <span class="p">~</span> <span class="n">host</span> <span class="p">~</span> <span class="s">"'"</span> <span class="p">:</span> <span class="s">"null"</span><span class="p">,</span> <span class="s">":"</span><span class="p">,</span>
<span class="n">port</span> <span class="p">?</span> <span class="s">"'"</span> <span class="p">~</span> <span class="n">port</span> <span class="p">~</span> <span class="s">"'"</span> <span class="p">:</span> <span class="s">"null"</span><span class="p">,</span> <span class="s">" ("</span><span class="p">,</span> <span class="n">flags</span><span class="p">,</span> <span class="s">"): "</span>
<span class="p">);</span>
<span class="k">try</span> <span class="p">{</span>
<span class="n">getAddressInfo</span><span class="p">(</span><span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="p">,</span> <span class="n">flags</span><span class="p">)</span>
<span class="p">.</span><span class="n">sort</span><span class="p">!((</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="p">=></span> <span class="n">a</span><span class="p">.</span><span class="n">family</span> <span class="p"><</span> <span class="n">b</span><span class="p">.</span><span class="n">family</span><span class="p">)</span>
<span class="p">.</span><span class="n">map</span><span class="p">!(</span><span class="n">a</span> <span class="p">=></span> <span class="n">text</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">address</span><span class="p">,</span> <span class="s">" ("</span><span class="p">,</span> <span class="n">a</span><span class="p">.</span><span class="n">protocol</span><span class="p">,</span> <span class="s">")"</span><span class="p">))</span>
<span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="s">", "</span><span class="p">)</span>
<span class="p">.</span><span class="n">writeln</span><span class="p">;</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(</span><span class="n">Exception</span> <span class="n">e</span><span class="p">)</span> <span class="p">{</span>
<span class="n">writefln</span><span class="p">(</span><span class="s">"[%s]"</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">msg</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div><figcaption><span>D program used for gathering the data (longer than necessary for somewhat nicely formatted output). </span></figcaption>
</figure>
D/Thrift: Performance and other random things2011-08-01T00:00:00+01:00http://klickverbot.at/blog/2011/08/d-thrift-gsoc-performance-and-other-random-things<p>This week, I will try to keep the post short, while still informative – I already spent way too much time being unproductive due to hard-to-track-down bugs to be in the mood for writing up extensive ramblings. So, on to the meat of the recent changes (besides the usual little cleanup commits here and there):</p>
<ul>
<li>
<p><em>Async client design</em>: Yes, even though it took me quite some time to come up with the original one, I had completely missed the fact that it would be unreasonably difficult to extend the support code with resource types other than sockets – long story short, <code>TAsyncSocketManager</code> now inherits from <code>TAsyncManager</code>, instead of being a part of it. Also, I split <code>TFuture</code> into two parts, a <code>TFuture</code> interface for accessing the result, and a <code>TPromise</code> implementation for actually setting/storing it, and only the <code>TFuture</code> part is returned from the async client methods. The <a href="/code/gsoc/thrift/docs/thrift.async.base.html">thrift.async docs</a> are actually useful now.</p>
</li>
<li>
<p><em>Async socket timeouts</em>: Correctly handling the state of the connection after a <code>read</code>/<code>write</code> timeout turned out to be a surprisingly tough problem to solve (allowing other requests to be executed on the same connection after a timeout could lead to strange results). In the end, I settled for just closing the connection, which is a simple yet effective solution. To correctly implement this, I also had to finally kill the <code>TTransport.isOpen</code>-related contracts and replace them with exceptions in the right places, leading to modified/clarified <em><code>isOpen</code> semantics</em>.</p>
</li>
<li>
<p>The <em>non-blocking server</em> now handles one-way calls correctly, and modifying the task pool after it is running no longer leads to undefined results. In the process, I have also turned the static <code>event</code> struct allocations into dynamic ones, since this should have no measurable performance impact, but removes the dependence on the (unstable, per the <code>libevent</code> docs) struct layout.</p>
</li>
<li>
<p>D now also has a <code>TPipedTransport</code>, which forwards a copy of all data read/written to another transport, useful e.g. for logging requests/responses to disk.</p>
</li>
<li>
<p>The biggest chunk of time was actually spent on <em>performance investigations</em>: While I was pretty certain that the D serialization code should not perform any worse than its C++ counterpart already, the difference in speed merely being compiler-dependent, I wanted to prove this fact so that I could cross this item off the list. This involved updating <a href="http://dsource.org/projects/ldc">LDC</a> to the 2.054 frontend (only to discover that Alexey Prokhin decided to start work on it at the same time I did, the related commits in the <a href="https://bitbucket.org/lindquist/ldc">main repository</a> are his now), fixing some LDC-specific druntime bugs, etc<sup class="footnote" id="fnr1"><a href="#fn1">1</a></sup>. Unfortunately, I couldn’t test GDC because of <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6411">issue 6411</a>, but without further ado, here are the results:</p>
</li>
</ul>
<figure>
<table class="firstname">
<thead>
<tr>
<th> </th>
<th>Writing / kHz</th>
<th>Reading / kHz</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>DMD v2.054, -O -release -inline</td>
<td>2 051</td>
<td>1 030</td>
</tr>
<tr>
<td>GCC 4.6.1 (C++), -O2, templates</td>
<td>5 667</td>
<td>1 050</td>
</tr>
<tr class="odd">
<td>LDC, -O3 -release</td>
<td>2 300</td>
<td>1 077</td>
</tr>
<tr>
<td>LDC, -output-ll / opt -O3</td>
<td>5 500</td>
<td>3 150</td>
</tr>
<tr class="odd">
<td>LDC, -output-ll / opt -std-compile-opts</td>
<td>6 700</td>
<td>1 950</td>
</tr>
</tbody>
</table>
</figure>
<p>At this point, I will disregard my earlier resolution and again get into the nitty-gritty details – the rest of this post can easily be summarized as <em>the D version is indeed up to par with C++, when it is equally well optimized</em>, but if you are curious about the details, read on.</p>
<p>If you read the performance figures from my last post, the first thing you will probably notice is that the C++ reading performance figure is about four times lower now. This isn’t a mistake; noting the comparatively slim advantage of the C++ version, I made a <a href="https://github.com/dnadlinger/thrift/commit/e7ab6c3b14b31c0241a1d37e674d3fefcbb53276">change</a> to it quite some time ago, which avoids allocating a new <code>TMemoryBuffer</code> instance on every loop iteration (the D version also reuses it). Without really considering the implications, though, I also moved the construction of the <code>OneOfEach</code> struct out of the loop. This seemed like a minor detail to me, but in fact, it enabled reuse of the <code>std::string</code>-internal buffers for the string members of the struct, which is unrealistic (e.g. for a pretty similar situation in the non-blocking server, there is no buffer reuse possible as well).</p>
<p>In a situation where a big part of the time is spent actually allocating and copying around memory, this makes a big difference. To test this assumption about the influence of memory allocations, I compiled a version of the D benchmark where a static buffer for the strings was used instead of reallocating them every time, and indeed, the reading performance was more than twice as high.</p>
<p>The <code>std::string</code> implementation of the GCC STL seems to be fairly inefficient in this case, because the best D result (which uses GC-allocated memory), is almost three times faster than it for the reading part. It is possible that there are some further optimizations which could improve performance (<code>-O3</code> didn’t change things for the better, in case you are wondering), but as my goal wasn’t to squeeze every last bit of performance out of this synthetic benchmark, I didn’t investigate this issue any further.</p>
<p>But now to the D results: Simply switching to LDC 2 instead of DMD didn’t give any great speedups, because <code>readAll()</code> wasn’t inlined by it either, thus leaving all the memory copying unoptimized, as discussed in the last post. To see how much of a difference this would really make, I compiled the D code to LLVM IR files and manually ran the optimizer/code generator/linker on them, with the plan being to manually add the <code>alwaysinline</code> attribute to the relevant pieces of IR:</p>
<figure><pre><code>ldc2 -c -output-ll -oq -w -release -I../src -Igen-d ….d
llvm-link *.ll -o benchmark.bc
opt {-O3, -std-compile-opts} benchmark.bc -o benchmark_opt.bc
llvm-ld -native -llphobos2 -ldl -lm -lrt benchmark_opt.bc
</code></pre></figure>
<p>I then discovered that the method calls in question were properly inlined by the stand-alone <code>opt</code> without any manual intervention anyway. I am not really sure why this happens; the inliner cost limits could be more liberal in this case, or the optimization passes being scheduled in a different way than inside LDC could have an impact, or maybe it’s connected to the fact that <code>TMemoryBuffer</code> and the caller are in different modules (to my understanding, LTO <em>shouldn’t</em> be required to optimize in this case, but it may well be that I am mistaken here).</p>
<p>The <code>LDC -output-ll</code> rows in the above table correspond to the benchmark compiled this way, with the <code>-std-compile-opts</code> and <code>-O3</code> flags passed to <code>opt</code>, respectively. This is a nice example of how important compiler optimizations for this, again, synthetic benchmark really are: for the reading part of the benchmark, <code>-O3</code> gives a nice speed boost because of the more aggressive inlining (<code>-std-compile-opts</code> doesn’t touch <code>TBinaryProtocol.readFieldBegin()</code>, which is called 15 times per loop iteration and contains some code that can completely be optimized out), but for the writing part, its result is actually <em>slower</em>, presumably because of locality effects (the call graphs are identical).</p>
<p>The only change related to benchmark performance I made since the last post was an LDC-specific workaround to stop manifest constants from incorrectly being leaked from the CTFE codegen process into the writing functions. I think the above results are justification enough to stop worrying about raw serialization performance – the results when using the Compact instead of the Binary protocol are similar – and move on to more important topics<sup class="footnote" id="fnr2"><a href="#fn2">2</a></sup>.</p>
<p class="footnote" id="fn1"><a href="#fnr1"><sup>1</sup></a> <s>If you are curious about LDC 2, you can get the source I used from the <a href="https://bitbucket.org/lindquist/ldc">official hg repo</a>, and the LDC-specific <a href="https://github.com/dnadlinger/druntime/tree/ldc2">druntime</a> and <a href="https://github.com/dnadlinger/phobos/tree/ldc2">Phobos</a> source from my clones at GitHub</s>. LDC is <a href="https://github.com/ldc-developers/ldc">officially on GitHub</a> now.</p>
<p class="footnote" id="fn2"><a href="#fnr2"><sup>2</sup></a> Such as performance-testing the actual server implementations, but I don't expect any big surprises there, and I am not sure how to reliably benchmark the network-related code – running server and clients on the same machine is probably a bad idea?</p>
D/Thrift: Non-Blocking Server, Async Client, and more2011-07-15T00:00:00+01:00http://klickverbot.at/blog/2011/07/d-thrift-gsoc-nonblocking-server-async-client-and-more<p>First of all, the usual apologies for publishing this post later than I originally planned to. No, seriously, drafting a solid asynchronous client implementation ended up being a lot more work than I originally anticipated, but I wanted to discuss my ideas in this status report. Now, the post turned out way too large anyway, but I guess that’s what I deserve. ;)</p>
<p>Also, a quick notice beforehand: A week ago, DMD 2.054 was released. It is the first version to include, amongst a wealth of other improvements, Don’s necessary CTFE fixes and my <code>std.socket</code> additions. This means that it is no longer necessary to use a Git build to use Thrift with D, you can just go to <a href="http://www.digitalmars.com/d/download.html">digitalmars.com</a> and fetch the latest package for your OS.</p>
<h2 id="small-but-useful-additions">Small but useful additions</h2>
<p>But before discussing the intricacies of non-blocking I/O, on to the mundane helper transports that found their way into the D library: The first addition was a simple <code>TInputRangeTransport</code> which, as the name says, just reads data from a generic <code>ubyte</code> input range, with some optimizations for the case where the source is a plain <code>ubyte[]</code> (<code>std.algorithm.put</code> is currently unnecessarily slow if both ranges are sliceable; I haven’t had time to prepare a fix for Phobos yet). It can e.g. be used in cases where you want to deserialize some data from a memory buffer and don’t need to write anything back (which is where <code>TMemoryBuffer</code> would be used).</p>
<p>Another addition is <code>TZlibTransport</code>, which wraps another transport to compress (deflate) data before writing it to the underlying transport, and decompress (inflate) it after reading. This is implemented by directly using zlib (via the C interface) instead of using <code>std.zlib</code>, because the API of the latter would have made it impossible to avoid needlessly allocating buffers all the time. Thankfully, the C++ library already included a zlib-based implementation, saving me from working out the various corner cases.</p>
<h2 id="some-deserialization-micro-optimizations">Some deserialization micro-optimizations</h2>
<p>The next thing I worked on was a set of further optimizations motivated by the <code>serialization_benchmark</code>. To recapitulate, it is a <a href="https://github.com/dnadlinger/thrift/blob/d-gsoc/lib/d/test/serialization_benchmark.d">trivially simple application</a> which just serializes a struct (<code>OneOfEach</code> from <code>DebugProtoTest.thrift</code>, to be precise) to a <code>TMemoryBuffer</code> and then reads the data back into the struct again, repeating both parts a number of times to be able to get meaningful timing results. Here are my related changes:</p>
<ul>
<li>
<p>First, I replaced <code>TMemoryBuffer</code> with the new <code>TInputRangeTransport</code> to avoid copying the data on each iteration of the reading loop. Because the initial copying to the memory buffer took only ~1–2% of the overall time anyway, this didn’t have a great speed impact.</p>
</li>
<li>
<p>The next change was to provide a shortcut version of <code>TTransport.readAll()</code> for <code>TInputRangeTransport</code> (and <code>TMemoryBuffer</code> as well). Previously, the generic <code>TBaseTransport</code> version which just calls <code>read()</code> in a loop was used – because the method is called about 50 times per reading loop iteration, replacing it with a simple slice assignment gave a ~20% speedup on the reading part of the serialization benchmark.</p>
</li>
<li>
<p>Furthermore, I nuked the protocol-level »read length« limit implemented for the Binary and Compact protocols. This was motivated not so much by optimization as by the fact that limiting the total amount of data read really belongs at the transport level in my eyes (it was only present because of a, uhm, misguided attempt to draw inspiration from the Java library). Incidentally, this gave another ~15% speedup in the reading benchmark. I will add support for limiting the container and string size Really Soon™ (just as for C++, to be able to somehow cap the amount of memory allocated due to network input), but one more branch per container/string read should have a negligible performance impact.</p>
</li>
<li>
<p>Finally, I removed a few instances where memory was unnecessarily zero-initialized (only to be completely overwritten later) in the reading code. For the integer buffers (used for byte order conversion) this gave a small but measurable (<5%) performance boost, and for the binary/string reading (which is both larger in size and exercised more often during the benchmark) another ~8% speedup.</p>
</li>
</ul>
<h2 id="profiling-results">Profiling results</h2>
<p>So, after all these (de)serialization micro-optimizations (I improved the writing part when first working on performance), how does the D implementation compare to its natural competitor, the C++ one? Well, frankly not too well at this point. Before discussing my findings in more detail, the performance results as measured on an x86_64 Arch Linux VM<sup class="footnote" id="fnr1"><a href="#fn1">1</a></sup>, hosted on my MacBook Pro (Intel Core i7-620M 2.66 GHz, OS X 10.6), by running each part 10 000 000 times and averaging over it (the results are in 1 000 operations per second, so both implementations can perform on the order of a million reads/writes per second):</p>
<figure>
<table class="firstname">
<thead>
<tr>
<th> </th>
<th>Writing / kHz</th>
<th>Reading / kHz</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>DMD v2.054, -O -release -inline</td>
<td>2 051</td>
<td>1 170</td>
</tr>
<tr>
<td>GCC 4.6.1, -O2</td>
<td>4 624</td>
<td>2 053</td>
</tr>
<tr class="odd">
<td>GCC 4.6.1, -O2, templates</td>
<td>5 667</td>
<td>4 509</td>
</tr>
</tbody>
</table>
</figure>
<p>The first GCC row shows the result of the vanilla build (what you get by simply doing <code>cd lib/cpp/test; make Benchmark; ./Benchmark</code>), while for the »templates« row, I added the (undocumented?) <code>templates</code> flag to the generator invocation (<code>thrift -gen cpp:templates</code>), which causes the struct reading/writing methods to be templated on the actual protocol type, much like what I implemented for D. In this benchmark, eliminating any indirections naturally has a huge impact on the performance.</p>
<p>So, why does the D version have less than half the throughput for writing, and why is it almost four times slower at reading? Let me first point out that the actual code for the C++ and D implementations is, from a semantic point of view, virtually the same (with the exception of D using garbage collected memory for <code>string</code>/<code>binary</code> data). I think I have arrived at a point where the single largest factor influencing the performance of the serialization code is the compiler used, or to be more exact, how well it optimizes the code.</p>
<p>What follows are a few results from my profiling sessions (Valgrind 3.6.1, visualized using KCachegrind<sup class="footnote" id="fnr2"><a href="#fn2">2</a></sup>) which corroborate my assumption that compiler optimizations are the culprit here. Let’s first have a look at the profiler results for the reading part of the benchmark (this time, the loops were run only a million times each):</p>
<figure class="bigimg"><img alt="KCachegrind showing profiling results for the reading part of the C++ benchmark." src="/blog/2011/07/d-thrift-gsoc-nonblocking-server-async-client-and-more/cpp-readonly.png" /><figcaption>C++ reading time profile.</figcaption></figure>
<figure class="bigimg"><img alt="KCachegrind showing profiling results for the reading part of the D benchmark." src="/blog/2011/07/d-thrift-gsoc-nonblocking-server-async-client-and-more/d-readonly.png" /><figcaption>D reading time profile.</figcaption></figure>
<p>I only included the top six functions (by time spent in them) here for the sake of brevity, but for both implementations, the »long tail« of calls in the flat profile are actually runtime helper functions, mostly startup initialization code and memory management-related things used by the string reading functions (for D, GC calls show up prominently, because the benchmark allocates three million strings, which triggers almost 50 collections in between).</p>
<p>This also means that the compiler has done a pretty good job at combining all the tiny deserialization functions into the top-level struct reading function by inlining – with one glaring difference: DMD chose not to inline <code>TInputRangeTransport.readAll()</code>, which is ultimately called when deserializing each and every member to read the actual bytes off the wire (or in this case, from memory), yielding 49 million additional function calls. To make matters worse, this means that the number of bytes requested each time (e.g. 4 for an integer) is not known at compile time, so the generic <code>memcpy</code> implementation has to be called every time. The C++ implementation, on the other hand, only calls <code>memcpy</code> in those situations where the number of bytes copied really depends on a runtime value, which is the case for strings, as they are intrinsically variable-length (the remaining memcpy calls happen during initialization and when initially writing the struct to the buffer).</p>
<p>Profiling the writing part shows similar results:</p>
<figure class="bigimg"><img alt="KCachegrind showing profiling results for the writing part of the C++ benchmark." src="/blog/2011/07/d-thrift-gsoc-nonblocking-server-async-client-and-more/cpp-writeonly.png" /><figcaption>C++ writing time profile.</figcaption></figure>
<figure class="bigimg"><img alt="KCachegrind showing profiling results for the writing part of the D benchmark." src="/blog/2011/07/d-thrift-gsoc-nonblocking-server-async-client-and-more/d-writeonly.png" /><figcaption>D writing time profile.</figcaption></figure>
<p>Again, for the C++ version, everything is inlined into <code>OneOfEach.write()</code>, in which over 80% of the time is actually spent, and just as for the reading part, the only instance where <code>memcpy()</code> is not inlined<sup class="footnote" id="fnr3"><a href="#fn3">3</a></sup> is for strings. On the other hand, the D version is optimized <em>almost</em> as well as the C++ version, the only exception being that <code>TMemoryBuffer.write()</code> is not inlined, which again prevents <code>memcpy</code> from being optimized (the other function showing up, <code>reset()</code>, only resets the output buffer once per iteration; it is inlined into <code>main</code> in the C++ version).</p>
<p>So, to recapitulate, I am not sure whether DMD would be able to replace a <code>memcpy()</code> call with optimized asm in the first place, but not knowing the length at compile time prevents that anyway. I am pretty sure that this difference of about a hundred million function calls, together with not being able to emit optimized code for the short (2, 4, 8, …) byte copies, accounts for a large part of the performance gap.</p>
<p>This assumption is supported by data gathered from a case where GCC chose not to inline <code>TBufferBase::write()</code> (which is the common path of <code>TMemoryBuffer::write()</code>). Interestingly, this actually happens at <code>-O3</code>, which is a <em>higher</em> optimization level than the <code>-O2</code> used above (I suppose some additional optimizations performed on the function raise its estimated inlining cost above the threshold). Just for comparison, here are again the top five functions from the profile:</p>
<figure class="bigimg"><img alt="KCachegrind showing profiling results for the writing part of the C++ benchmark compiled with -O3." src="/blog/2011/07/d-thrift-gsoc-nonblocking-server-async-client-and-more/cpp-writeonly-o3.png" /><figcaption>C++ writing time profile when compiled with <code>-O3</code>, causing <code>TBufferBase::write()</code> to be no longer inlined.</figcaption></figure>
<p>Just as for D, this means <code>memcpy</code> cannot be optimized away either. And unsurprisingly, this causes the performance to go through the floor as well: the executable now only reaches 2 519 thousand operations per second. The D version is still a bit slower at 2 051 kHz, but it is on a comparable level now.</p>
<p>So, to finally come to a conclusion, most of the performance gap between C++ and D presumably comes from DMD not inlining a key function and thus not being able to optimize away <code>memcpy</code> calls as well. An obvious experiment would be to try a different compiler like GDC or LDC, both of which are known to generally optimize better than DMD does. Unfortunately, both of them are currently at front-end version 2.052, but my Thrift code currently requires 2.054.</p>
<p>There are two possible solutions to this: either sprinkle workarounds all over the Thrift code so the older DMD frontend and Phobos versions can be used for the benchmark, or update the frontend of GDC or LDC to 2.054. While the former would be entirely feasible, I think I will update the LDC frontend once I have some time to spare, as this will also be useful for other D projects (I chose LDC because I am already familiar with its codebase).</p>
<h2 id="libevent-based-non-blocking-server">Libevent-based non-blocking server</h2>
<p>If I didn’t lose you during all the talk about micro-optimization above, let me hereby present you the two main additions to the library during the last two weeks: a <em>non-blocking server implementation</em> and a Future-based <em>asynchronous client</em> interface.</p>
<p>I am not sure if I ever stated it explicitly (the timeline only has »event-based I/O Phobos lib?« in parentheses), but I was hoping to be able to come up with a small general-purpose non-blocking I/O library written in D as a by-product of this project. The obvious time to start working on it would have been when implementing the non-blocking server, but after considering several possible designs, I realized that I did not yet know the problem domain well enough to come up with something that is not just a cheap libevent/Boost.Asio rehash, yet that I could still be sure performs well enough for a production-grade Thrift server implementation.</p>
<p>Thus, I went with simply porting the C++ libevent-based server implementation over to D, which has the benefit of being battle-proven, so that I have something which I can advise people to use in production code without feeling guilty. There are a few instances where I needed to manually add a GC root for some memory passed to libevent, but other than that, the code is reasonably clean, even though it surely could be prettier if a native D »event loop« was used.</p>
<p>A word of warning for Windows users: While libevent is linked dynamically, making it easy to just use DLL builds on Windows, some pieces of the socket code have not yet been tested against WinSock. Currently, I am not even sure whether all of the code compiles on Windows, but I will run some tests there shortly to ensure all the new additions work as well.</p>
<h2 id="coroutine-based-tasyncclient">Coroutine-based <code>TAsyncClient</code></h2>
<p>Using an asynchronous/concurrent approach for network-related code with its intrinsic I/O latency seems like a very obvious thing to do, but to my knowledge e.g. the C++ libraries currently do not provide a generic async client implementation, which is part of the reason I did not tackle this earlier.</p>
<p>After getting accustomed to the general idea of non-blocking I/O, it seemed to be a good time to finally work on the topic. What I basically wanted to implement was a way to off-load client-side request/response handling, possibly for multiple connections, to a worker thread, providing a <a href="http://en.wikipedia.org/wiki/Futures_and_promises">future-based</a> interface to the client code. For multiplexing multiple connections per worker thread, I wanted to experiment with a coroutine-based design.</p>
<p>As mentioned in the beginning of this report, coming up with a solid design took me a bit longer than expected, but as of now, <code>thrift.codegen</code> includes a fully functional <code>TAsyncClient</code> implementing such a scheme, also using libevent to have a portable means for handling non-blocking I/O. The new <code>thrift.async</code> package contains the related helper code, such as <code>TAsyncSocket</code> representing a non-blocking socket.</p>
<p>The new code is not yet well-documented or tested, and is still missing some important features like the ability to set timeouts on operations, but I have successfully tested basic use cases.</p>
<h2 id="plans-for-the-second-gsoc-half">Plans for the second GSoC half</h2>
<p>Which finally brings me to the end of this post: As my project fortunately passed the mid-term evaluations, it is now time to discuss how to go forward during the second part of the Summer of Code.</p>
<p>During the next week, I will work on some of the obviously unfinished things like async client documentation and tests, and will add a few missing utilities such as a <code>tee</code>-like transport which can be used to transparently log requests.</p>
<p>Speaking of documentation, this currently is a big issue for both the D implementation and, to a lesser extent, Thrift in general. However, as of now, I have worked sufficiently long on the code that I am effectively blind for what kinds of documentation a typical user would benefit the most from – more detailed API docs? Simple stand-alone examples with well-commented code? Tutorials? It would be great if you could let me know what you think would be useful.</p>
<p>With the non-blocking server implementation being completed, only the »performance« and »documentation« items from my original timeline remain, besides some general clean-up work being left to do. However, Nitay, my mentor, suggested a few other things which could be worth looking into, such as a generalized client for querying multiple servers, to be used for things like redundancy, load distribution, data verification, etc. I will discuss this in more detail and then update the timeline accordingly.</p>
<p class="footnote" id="fn1"><a href="#fnr1"><sup>1</sup></a> Why test in a Linux VM (Virtual Box 4.0.8) rather than directly on my OS X development box? Because Linux x86_64 is probably where most of the server-side deployments will end up, only an ancient GCC is available on OS X, DMD is still 32 bit-only there, and Valgrind/Callgrind which I used for profiling is not really usable on OS X 10.6. I am aware that using a VM might skew the results a bit, but I think the impact shouldn't be too large. Incidentally, the tests compiled and ran in the Linux VM generally performed faster than on the host.</p>
<p class="footnote" id="fn2"><a href="#fnr2"><sup>2</sup></a> I patched KCachegrind to elide the middle of the symbol name for better readability in width-limited screenshots, and used my <a href="https://gist.github.com/1069843">own little demangling tool</a> for the D results.</p>
<p class="footnote" id="fn3"><a href="#fnr3"><sup>3</sup></a> Technically, GCC handles <code>memcpy</code> as compiler built-ins, so inlining might not be precisely the right term, but the effect (avoiding a function call) is the same.</p>
D/Thrift: Docs, Servers, Tests2011-07-04T00:00:00+01:00http://klickverbot.at/blog/2011/07/d-thrift-gsoc-docs-servers-tests<p>Dear Reader,</p>
<p>Let me apologize for not being terribly motivated to write a blog post right now, but I was lucky enough to catch the flu last week with temperatures between 25 °C and 35 °C outside, and while the fever has gone by now, I am still depleting my tissue stock at an insanely high rate…</p>
<p>Anyway, back to topic, the usual summary of the recent changes to my Thrift GSoC project:</p>
<ul>
<li>
<p>The most important item on the list from a user point of view are probably the <em>documentation</em> improvements: The project now has a <a href="https://github.com/dnadlinger/thrift/wiki/Getting-Started-with-Thrift-and-D">Getting Started page</a>, and I have made a complete pass through all the DDoc docs, a build of which is <a href="http://klickverbot.at/code/gsoc/thrift/docs/">available here</a> (I still have to whip up a nice design, but it should do for the moment).</p>
</li>
<li>
<p>More interesting from a coding perspective are the additions of two <em>new <code>thrift.server</code> implementations</em>, <code>TThreadedServer</code> and <code>TTaskPoolServer</code>. The former is a naive implementation of a threaded server which just spawns a new worker thread per client connection, while the latter uses a <code>std.parallelism</code> thread pool to process the queued client connections (the maximum number of active connections is configurable). I also added a D version of the <code>StressTest</code> server for sanity checking.</p>
</li>
<li>
<p>Another server-related change is the addition of server and processor <em>event handlers</em>, which can be used to hook custom code into various points during the server/request lifecycle, e.g. for collecting diagnostic data. Data can be persisted between calls by saving it as connection/call »context«, which is a <code>Variant</code> the server code passes around for you (I went with variants over e.g. templating the server code on the context object type simply to avoid adding another layer of complexity for a non-essential feature).</p>
</li>
<li>
<p>I added a standalone test case exercising the different transport types (socket, file, memory buffer) combined with the various wrapper transports (buffered, framed), modeled after the C++ <code>TransportTest</code>. This has uncovered a number of (sometimes not-so-) subtle defects in the transport implementations which have since been fixed (<code>TSocket</code> not handling <code>EINTR</code>, the framed/memory buffer <code>borrow()</code> returning less data than requested, <code>TFileReaderTransport</code> not tailing files correctly, …).</p>
</li>
<li>
<p>Build system improvements: The stand-alone test cases are now organized in a much less cumbersome scheme, and DDoc documentation is generated for the library by default. <code>lib/d/README</code> now has instructions on how to generate a self-signed certificate for SSL socket testing.</p>
</li>
</ul>
<p>If you are on OS X, you might want to manually apply <a href="https://github.com/D-Programming-Language/phobos/pull/131">Phobos pull request 131</a> until it is merged into Git master to avoid your servers crashing due to an unhandled <code>SIGPIPE</code> (you can also just set <code>signal(SIGPIPE, …)</code> to <code>SIG_IGN</code> in your startup code).</p>
<p>I have also added a list of not yet scheduled ideas to the <a href="/code/gsoc/thrift/">project page</a>. Implementing a ZLib compression transport is currently the top item on my list, after which I will start to work on a non-blocking server implementation as planned. An asynchronous version of <code>TClient</code> is something I certainly want to implement, but I am planning to defer work on it until I have tackled the non-blocking server, as I could end up using the same approach (e.g. <code>libevent</code>) for it.</p>
D/Thrift: Compact, JSON protocols, performance2011-06-22T00:00:00+01:00http://klickverbot.at/blog/2011/06/d-thrift-gsoc-protocols-compact-json-performance<p>Another week of my Google Summer of Code project passed by, and so you are reading another status update. I am not including any core D development-related news this time, first because I didn’t do much DMD/Phobos work last week, and second because it gets tedious to list everything here – feel free to see my GitHub activity stream for more information. But still, thanks to Sean Kelly for quickly fixing the <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6135">OS X threading/GC race condition</a> I encountered the week before.</p>
<p>One of my targets last week was to do some preliminary performance investigations and to use the insights gained to modify the protocol interface accordingly before implementing additional protocols. For this, I used the <code>DebugProtoTest.thrift</code>-based serialization performance test already implemented for C++ and Java (see the <a href="https://github.com/dnadlinger/thrift/blob/d-gsoc/lib/d/test/serialization_benchmark/benchmark.d">D version at GitHub</a>; a more intensive look at performance, including the creation of some more extensive benchmarks, is planned for later).</p>
<p>Ironically, the change with the biggest impact on the writing performance didn’t have anything to do with the protocol interface at all: When first writing <code>TMemoryBuffer</code>, I simply implemented <code>write()</code> as a D array appending operation, because I didn’t want to spend much time on optimizing it yet, and I figured that as long as there would not be too many reallocations, it should be reasonably fast for testing purposes. Array appending translates to a non-inlined and not really cheap D runtime call, however, and <code>TMemoryBuffer.write()</code> unsurprisingly happens to be the single most called function in the whole writing part of the benchmark. After changing <code>TMemoryBuffer</code> to manual <code>malloc</code>/<code>free</code>-style memory management, the writing part finished in <em>less than 30%</em> of the time.</p>
<p>I tried to switch to <code>GC.malloc</code> instead of manual freeing afterwards because it would make getting a buffer content slice safe and the small memory allocation overhead should not really be a problem for typical <code>TMemoryBuffer</code> use cases (it does not matter at all in this benchmark because the required amount of memory is pre-allocated), but I encountered some strange data corruption issues in the other larger test cases I have yet to track down. Most probably, I just missed some subtleties when treating <code>GC.realloc</code> as a drop-in <code>realloc</code>/<code>free</code> replacement, but I just didn’t find a way to pin-point the issue.</p>
<p>For the next step, I tackled the design of the <code>TProtocol</code> interface: When building the first prototype for the library, I had the ad-hoc idea of passing in delegates to the aggregate reading/writing functions for processing their members. I figured that this would make the interface nicer as all the <code>*Begin()/*End()</code> pairs could be collapsed into a single call, the struct member reading loop could be moved into the protocol itself instead of being duplicated over and over again (although this is not a real benefit besides a slight code size reduction because it is generated code anyway), and implementing protocols like JSON would be easier since the structural information would not have been completely lost compared to a »flat« interface.</p>
<p>I was, however, aware of the fact that this could pose a performance problem, and indeed some experimenting showed that DMD generated suboptimal code for delegate literals and was not really able to inline them, even for <code>scope</code>d delegates. From a compiler point of view, this is not really surprising as generating better code would require a fair bit of analysis to be done, but still I decided to switch to a more simplistic protocol design for the time being – even more so, as I realized that my design idea would not really simplify implementing JSON-like protocols anyway. I chose to go with the C++/Java interface verbatim, as it is proven to work (and having a similar interface across multiple languages has its own merits as well), and with the changes in, I measured a <em>20% speedup</em>, even though no inlining was possible due to virtual calls all over the place yet. (In hindsight, it might have been better to implement the template mechanism first, so that the actual impact of the protocol API change would have been more visible. Maybe I’ll revert the binary protocol back to the old interface and re-run the test to get precise numbers at some point in the future.)</p>
<p>Finally, I implemented a way to specify the concrete transport/protocol types used in the application at compile-time using templates (similar to C++ and the <code>templates</code> Thrift compiler argument), thus eliminating most virtual calls and enabling the compiler to inline calls all over the library. I expected to see a dramatic speedup here as well – when not specifying the protocol/transport type, the writing loop in the C++ benchmark is only half as fast –, but instead I saw »only« a <em>40% speedup</em> overall, with the C++ version still being significantly faster.</p>
<p>When comparing profiling data for the optimized C++ and D versions, I noticed that in the D version <code>_memcpy</code> gets called ten times as often as in the C++ version – GCC, which can inline the <code>write()</code> calls, replaces them with optimized routines for shorter lengths, and since both versions spend most of their time actually copying data around at this point, this yields a huge advantage.</p>
<p>After that, I did not make any further attempts at optimizing the D version, since performance was not my primary goal at this stage anyway – the basic design seems to be solid, and what is left are micro-optimizations. When focussing on performance later in the term, I will certainly create more benchmarks, and also try to optimize the languages I will compare D to (C++ and Java, most likely) – for example, the current C++ serialization benchmark from the official HEAD does a lot of unneeded work in the reading loop; moving out the initialization code makes it run twice as fast. I will also have a look at using GDC and LDC instead of DMD for their more sophisticated backends, and document the exact performance findings on various platforms.</p>
<p>Even though I am not going to write that much about it, I spent the bigger part of my time on non-performance-related work: Generated structs now have an appropriate <code>toString()</code> and <code>opEquals()</code> implementation, the D ThriftTest client actually checks the data it sends/receives instead of just flooding the console with messages (no idea why this hasn’t already been implemented for C++ and Java), and last but not least, I implemented the Compact and JSON protocols for D. This completes the protocol section, as I do not plan to implement the <em>Dense</em> protocol unless there is much time left to spend toward the end of the term (as previously discussed).</p>
<p>During the next (or rather: this) week, I am going to work on documentation, integrate a number of test cases I have already lying around with the repository/build system, and implement a simple multithreaded server.</p>
D/Thrift GSoC: Growing the library2011-06-14T00:00:00+01:00http://klickverbot.at/blog/2011/06/d-thrift-gsoc-growing-the-library<p>First, let me apologize for not posting an update last week – I had a busy time, but regardless I will try to let you know about the state of affairs regularly in the future. Now, what was I working on? I updated my <a href="/code/gsoc/thrift/">project page</a> based on the timeline previously discussed on my project mailing list, and – besides me being a day late, more on that below – it is still valid. These were the main points I worked on:</p>
<ul>
<li>
<p><em>Build system integration</em>: The D library is now integrated with the Thrift Autoconf/Automake build system. If a working D2 compiler is detected, the <code>libthriftd</code> static library containing all the modules is now built on issuing <code>make</code> along with the rest of Thrift. <code>make check</code> runs the unit tests for each D module and builds the standalone test executables (i.e. <code>ThriftTest</code> for now).</p>
</li>
<li>
<p><em>Socket transport enhancements</em>: Implemented <code>interrupt()</code> for the server socket, which can be used to notify a server blocking on a socket waiting for connections about shutdown; added socket timeouts; properly handled exceptions thrown by <code>std.socket</code>; …</p>
</li>
<li>
<p>Added a D implementation of <em><code>TMemoryBuffer</code></em>, which is widely used internally and a nice tool for writing unit tests as well. Implemented the <em>Framed</em> transport in D.</p>
</li>
<li>
<p>Implemented <em><code>TFileReaderTransport</code></em> and <em><code>TFileWriterTransport</code></em>, the D equivalent to the C++ <code>TFileTransport</code>. I separated the two components because I could not really think of a situation in which you would use both at once, and conflating the two would complicate the state space (I am not even sure if the C++ implementation does what it is supposed to if read/write calls are interleaved) and make the implementation unnecessarily complex. The <code>TFileWriterTransport</code> implementation performs the actual file I/O in a separate worker thread, which communicates with the main thread using a message passing approach (leveraging D’s <code>std.concurrency</code> module).</p>
</li>
<li>
<p>A simple <em>HTTP client/server transport</em>, closely modeled after the C++ implementation.</p>
</li>
<li>
<p>An <em>SSL client/server socket</em> implementation using the OpenSSL library, which is linked in dynamically (primarily for easy Windows compatibility). The actual implementation is pretty much a direct port of the C++ <code>TSSLSocket</code>, but I had to quickly write the D2 bindings for OpenSSL first. For now, the bindings live in <code>thrift.util.openssl</code>, as I only included the subset of functions I needed for Thrift, but I might move them out in the future.</p>
</li>
</ul>
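<p>The worker thread design described above for <code>TFileWriterTransport</code> can be sketched roughly like this – all names are illustrative and not the library’s actual API; note that <code>std.concurrency</code> messages must be value types or immutable:</p>

```d
import std.concurrency;
import std.stdio;

// Illustrative sketch of the message-passing design: the actual file
// I/O runs in a worker thread that is fed through std.concurrency.
struct Shutdown {}

void writerLoop()
{
    bool running = true;
    while (running)
    {
        receive(
            (immutable(ubyte)[] chunk) {
                // A real implementation would append the chunk to the
                // log file here; we only report its size.
                writefln("writing %s bytes", chunk.length);
            },
            (Shutdown s) { running = false; }
        );
    }
}

void main()
{
    auto writer = spawn(&writerLoop);
    immutable(ubyte)[] data = [1, 2, 3];
    writer.send(data);       // hand a chunk to the I/O thread
    writer.send(Shutdown()); // ask it to terminate
}
```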
<p>As always, you can find the changes <a href="https://github.com/dnadlinger/thrift">on my GitHub fork</a>. I also spent a sizable chunk of my time on contributing some improvements and fixes to the D compiler and standard library projects. As for the issues I mentioned two weeks ago, kudos to Don Clugston for promptly fixing CTFE issues <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6077">6077</a> and <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6078">6078</a>, and my <a href="https://github.com/D-Programming-Language/dmd/pull/77">DMD pull request 77</a> and <a href="https://github.com/D-Programming-Language/phobos/pull/65">Phobos pull request 65</a> were also merged in the meantime.</p>
<p>During the last two weeks, I worked on Phobos pull requests <a href="https://github.com/D-Programming-Language/phobos/pull/73">73</a> (adds <code>std.socket.socketpair</code>), <a href="https://github.com/D-Programming-Language/phobos/pull/87">87</a> (better <code>std.file</code> error messages), <a href="https://github.com/D-Programming-Language/phobos/pull/90">90</a> (fixes a mailbox handling bug in <code>std.concurrency</code> – took me quite some time to track down as it caused sporadic deadlocks in my unit tests), <a href="https://github.com/D-Programming-Language/phobos/pull/99">99</a> (adds timeout handling and hostname lookup to <code>std.socket</code> – I still don’t know why WinSock adds 500 ms to the <code>recv()</code> timeout), <a href="https://github.com/D-Programming-Language/druntime/pull/28">druntime pull request 28</a> (adds a Posix <code>netdb.h</code> module) and <a href="https://github.com/D-Programming-Language/dmd/pull/118">DMD pull request 118</a> (finally removes the <code>_DH</code> flag).</p>
<p>Furthermore, I collaborated with Daniel Murphy on fixing the long-standing issue that function pointers are not properly typechecked, resulting in <a href="https://github.com/D-Programming-Language/dmd/pull/96">DMD pull request 96</a> and <a href="https://github.com/D-Programming-Language/druntime/pull/26">druntime pull request 26</a>. I have also started to work on the dreaded DMD <a href="http://d.puremagic.com/issues/show_bug.cgi?id=314">bug 314</a>. While the basic fix is in place (I adapted the D1/LDC changes by Christian Kamm to D2/DMD) – that’s how I found the bug in <a href="https://github.com/D-Programming-Language/phobos/pull/102">Phobos pull request 102</a> –, I still need to add some more tests and solve a few more complex cases. Unfortunately, I also hit two new issues which I have not been able to fix yet: <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6108">6108</a>, a DMD contract inheritance bug, and <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6135">6135</a>, a druntime OSX threading/GC crash.</p>
D/Thrift GSoC: First results2011-05-29T00:00:00+01:00http://klickverbot.at/blog/2011/05/d-thrift-gsoc-first-results<p>The first week of my <a href="/code/gsoc/thrift/"><em>D/Thrift</em> project</a> as part of the <a href="http://d-programming-language.org">D programming language</a> <a href="http://www.google-melange.com/gsoc/org/google/gsoc2011/dprogramminglanguage">Google Summer of Code 2011</a> is over, and I am happy to be able to share some first results. If you are not sure what I am talking about yet: <a href="http://thrift.apache.org">Apache Thrift</a>, originally developed for internal use at <a href="http://facebook.com">Facebook</a>, is both a data serialization/RPC protocol and its reference implementation. In short, it works by defining data types and service interfaces in a language-agnostic interface definition file. Then, a compiler (currently written in C++) is used to generate code from that <code>.thrift</code> file, building on target language support libraries which contain the actual serialization protocol/RPC implementation. Currently, Thrift supports a large number of languages including C++, Java, PHP and Python.</p>
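<p>To illustrate, a minimal interface definition file might look like this (a made-up example; see the Thrift documentation for the full IDL syntax):</p>

```thrift
# calculator.thrift – a made-up, minimal interface definition.
struct Work {
  1: i32 lhs
  2: i32 rhs
}

service Calculator {
  i32 add(1: i32 lhs, 2: i32 rhs)
}
```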
<p>It was clear from the beginning that I would stick with this approach for my implementation, not only because the informal project goal is to establish D as an equal target language besides the existing ones, but simply because one of the main strengths of Thrift is that you can use the same interface definition for all target languages, with the compiler doing all the heavy lifting for you. I did, however, want to leverage the powerful metaprogramming capabilities of D (compile-time reflection, <abbr title="Compile Time Function Execution">CTFE</abbr>, string mixins) to lift as much work off the »ahead-of-time« C++ code generator as possible, keeping at the back of my mind the option to use the Thrift libraries beyond the traditional scope of the project for ad-hoc extension of existing D data types and interfaces with serialization/RPC functionality.</p>
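<p>To give a rough idea of the kind of compile-time reflection this builds on – a sketch of the general technique, not the library’s actual serialization code:</p>

```d
import std.stdio;

// Iterate over a struct's fields at compile time; the foreach below is
// unrolled during compilation, so no per-type code has to be generated
// ahead of time by an external tool.
void writeStruct(T)(ref T value)
{
    foreach (i, field; value.tupleof)
    {
        // __traits(identifier, ...) yields the field name at compile time.
        writefln("%s = %s", __traits(identifier, T.tupleof[i]), field);
    }
}

struct Point { int x; int y; }

void main()
{
    auto p = Point(1, 2);
    writeStruct(p); // prints "x = 1" and "y = 2"
}
```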
<p>My primary goal during the first week was to evaluate the feasibility of this approach by quickly implementing the basic parts of each Thrift component. In more detail, the sub-goals I tackled during the last week were:</p>
<ul>
<li>
<p>Create a preliminary implementation of the central parts of the support library (<code>TBinaryProtocol</code>, <code>TBufferedTransport</code>, <code>TSocket</code>, …) using the C++ and Java implementations as reference, to be able to directly test the progressing D implementation against other languages.</p>
</li>
<li>
<p>Implement the general and client-specific parts of compile-time code generation (struct reading/writing, method arguments/result struct generation, <code>TClient</code>, …), using a hand-crafted Thrift tutorial interface to test it against the Java server.</p>
</li>
<li>
<p>Implement <code>TSimpleServer</code> and related basic server functionality (e.g. <code>TServerSocket</code>) to be able to test server code generation.</p>
</li>
<li>
<p>Complete missing server-side code generation bits (<code>TProcessor</code>, server-side arguments/result structs, …), again using a hand-crafted interface to test it against the Java Thrift calculator tutorial client.</p>
</li>
<li>
<p>Add D code generation to the Thrift compiler, and run the compiler against all the test interface files coming with Thrift (<code>test/*.thrift</code>) to catch any obvious issues.</p>
</li>
<li>
<p>Implement a <code>ThriftTest</code> server and client in D to exercise the more advanced serialization code paths and fix any bugs, testing it against the C++ implementation.</p>
</li>
</ul>
<p>So far, no major problems popped up, and I was able to complete the above list as planned. I did, however, hit a few bugs in DMD, which, on the other hand, doesn’t come as a total surprise, because I am heavily using the metaprogramming facilities. I have been able to find workarounds for all of the issues, but it nevertheless took me quite some time to track them down initially: issues <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6069">6069</a>, <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6077">6077</a>, <a href="http://d.puremagic.com/issues/show_bug.cgi?id=6078">6078</a>, <a href="https://github.com/D-Programming-Language/dmd/pull/77">DMD pull request 77</a> and – this one is merely an enhancement – <a href="https://github.com/D-Programming-Language/phobos/pull/65">Phobos pull request 65</a>.</p>
<p>If you want to have a look at the code, feel free to head to <a href="https://github.com/dnadlinger/thrift/tree/d-gsoc">my GitHub Thrift fork</a>, where I regularly push my work to. And just to give you a short glimpse of the very basic features (a lot more is already implemented), this is how you could implement a simple calculator server/client which adds two numbers using the Thrift library, without using any generated code.</p>
<figure class="code"> <div class="highlight"><pre><span class="c1">// This could also be generated from a .thrift file and contain</span>
<span class="c1">// structs, exceptions, etc.</span>
<span class="k">module</span> <span class="n">calculator</span><span class="p">;</span>
<span class="k">interface</span> <span class="n">Calculator</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">add</span><span class="p">(</span><span class="kt">int</span> <span class="n">lhs</span><span class="p">,</span> <span class="kt">int</span> <span class="n">rhs</span><span class="p">);</span>
<span class="p">}</span>
</pre></div><figcaption><span>Shared module containing the interface the server offers and the client consumes. </span></figcaption>
</figure>
<figure class="code"> <div class="highlight"><pre><span class="k">module</span> <span class="n">server</span><span class="p">;</span>
<span class="k">import</span> <span class="n">calculator</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">codegen</span><span class="p">.</span><span class="n">processor</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">protocol</span><span class="p">.</span><span class="n">binary</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">protocol</span><span class="p">.</span><span class="n">processor</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">server</span><span class="p">.</span><span class="n">simple</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">transport</span><span class="p">.</span><span class="n">buffered</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">transport</span><span class="p">.</span><span class="n">serversocket</span><span class="p">;</span>
<span class="k">class</span> <span class="n">CalculatorHandler</span> <span class="p">:</span> <span class="n">Calculator</span> <span class="p">{</span>
<span class="k">override</span> <span class="kt">int</span> <span class="n">add</span><span class="p">(</span><span class="kt">int</span> <span class="n">lhs</span><span class="p">,</span> <span class="kt">int</span> <span class="n">rhs</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="n">lhs</span> <span class="p">+</span> <span class="n">rhs</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// Expose a CalculatorHandler instance at port 9090.</span>
<span class="k">auto</span> <span class="n">protocolFactory</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TBinaryProtocolFactory</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">processor</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TServiceProcessor</span><span class="p">!</span><span class="n">Calculator</span><span class="p">(</span>
<span class="k">new</span> <span class="n">CalculatorHandler</span><span class="p">());</span>
<span class="k">auto</span> <span class="n">serverTransport</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TServerSocket</span><span class="p">(</span><span class="mi">9090</span><span class="p">);</span>
<span class="k">auto</span> <span class="n">transportFactory</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TBufferedTransportFactory</span><span class="p">();</span>
<span class="k">auto</span> <span class="n">server</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TSimpleServer</span><span class="p">(</span>
<span class="n">processor</span><span class="p">,</span> <span class="n">serverTransport</span><span class="p">,</span> <span class="n">transportFactory</span><span class="p">,</span> <span class="n">protocolFactory</span><span class="p">);</span>
<span class="n">server</span><span class="p">.</span><span class="n">serve</span><span class="p">();</span>
<span class="p">}</span>
</pre></div><figcaption><span>Server implementation, accepting connections on port 9090 using the binary protocol. </span></figcaption>
</figure>
<figure class="code"> <div class="highlight"><pre><span class="k">module</span> <span class="n">client</span><span class="p">;</span>
<span class="k">import</span> <span class="n">calculator</span><span class="p">;</span>
<span class="k">import</span> <span class="n">std</span><span class="p">.</span><span class="n">stdio</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">codegen</span><span class="p">.</span><span class="n">client</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">protocol</span><span class="p">.</span><span class="n">binary</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">transport</span><span class="p">.</span><span class="n">buffered</span><span class="p">;</span>
<span class="k">import</span> <span class="n">thrift</span><span class="p">.</span><span class="n">transport</span><span class="p">.</span><span class="n">socket</span><span class="p">;</span>
<span class="kt">void</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// Set up a client for the Calculator interface and try to</span>
<span class="c1">// connect to localhost:9090.</span>
<span class="k">auto</span> <span class="n">socket</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TSocket</span><span class="p">(</span><span class="s">"localhost"</span><span class="p">,</span> <span class="mi">9090</span><span class="p">);</span>
<span class="k">auto</span> <span class="n">transport</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TBufferedTransport</span><span class="p">(</span><span class="n">socket</span><span class="p">);</span>
<span class="k">auto</span> <span class="n">protocol</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TBinaryProtocol</span><span class="p">(</span><span class="n">transport</span><span class="p">);</span>
<span class="k">auto</span> <span class="n">client</span> <span class="p">=</span> <span class="k">new</span> <span class="n">TClient</span><span class="p">!</span><span class="n">Calculator</span><span class="p">(</span><span class="n">protocol</span><span class="p">);</span>
<span class="n">transport</span><span class="p">.</span><span class="n">open</span><span class="p">();</span>
<span class="c1">// Call the server's add() method and print the result.</span>
<span class="k">auto</span> <span class="n">lhs</span> <span class="p">=</span> <span class="mi">2</span><span class="p">;</span>
<span class="k">auto</span> <span class="n">rhs</span> <span class="p">=</span> <span class="mi">3</span><span class="p">;</span>
<span class="k">auto</span> <span class="n">sum</span> <span class="p">=</span> <span class="n">client</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">lhs</span><span class="p">,</span> <span class="n">rhs</span><span class="p">);</span>
<span class="n">writefln</span><span class="p">(</span><span class="s">"%s + %s = %s"</span><span class="p">,</span> <span class="n">lhs</span><span class="p">,</span> <span class="n">rhs</span><span class="p">,</span> <span class="n">sum</span><span class="p">);</span>
<span class="p">}</span>
</pre></div><figcaption><span>Client implementation. Note how the interface defined above in the <code>calculator</code> module is passed to TClient as a template parameter, which then generates the necessary RPC code. </span></figcaption>
</figure>
Random D development news2011-04-26T00:00:00+01:00http://klickverbot.at/blog/2011/04/random-d-development-news<p>During the last couple of weeks, I didn’t really find time to update this blog. Nevertheless, I was able to spare some time for work on a couple of open source projects related to the <a href="http://d-programming-language.org">D programming language</a>. But first, let me quickly summarize some great changes that will be in the next <span class="caps">DMD</span> release:</p>
<p>Don Clugston has basically <a href="https://github.com/D-Programming-Language/dmd/pull/23">re-implemented <span class="caps">CTFE</span></a> to fix a whole slew of compile-time function execution bugs, among which is the dreaded <a href="http://d.puremagic.com/issues/show_bug.cgi?id=1330">bug 1330</a>. There are still some regressions compared to <span class="caps">DMD</span> 2.052 (like <a href="http://lists.puremagic.com/pipermail/dmd-internals/2011-April/001448.html">this one</a>, which breaks QtD), but apart from those, it’s a big step towards getting <span class="caps">CTFE</span> out of the »experimental feature« category. The new architecture will also make implementing reference types easier, but this is still a long way off. The next <span class="caps">DMD</span>/Phobos release will also include the new <a href="http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html">std.parallelism</a> module by David Simcha, some GC optimizations and a large number of other improvements (among which is the addition of the <a href="https://github.com/D-Programming-Language/dmd/commit/2e261cd640e5266c569ad224ffbfe229a0315d97">parent trait</a>, so that QtD doesn’t need a patched <span class="caps">DMD</span> any longer) – due to the GitHub migration and the larger part of x86_64 support being done, the perceived development speed in the core community really went up a notch.</p>
<p>As for my own (insignificant, compared to the above) contributions, I did some work on <a href="http://dsource.org/projects/ldc"><span class="caps">LDC</span></a> during the last few days, porting it to <a href="http://llvm.org/"><span class="caps">LLVM</span> 2.9</a> and bringing the front-end in sync with <a href="http://digitalmars.com/d/1.0/changelog.html"><span class="caps">DMD</span> 1.067</a> – you can find the changes in the default branch over at <a href="https://bitbucket.org/lindquist/ldc">Bitbucket</a>. The <span class="caps">DMD</span> updates also contained some changes to the varargs <span class="caps">ABI</span> on x86_64 and other areas of the runtime interface, which I didn’t merge yet, because it would require an update to Tango as well. I am not aware of any regressions so far (see the <a href="/code/ldc/">DStress results</a>), but feel free to ping me in case of any problems.</p>
<p>There were also some updates and bug fixes to D support in <a href="http://swig.org"><span class="caps">SWIG</span></a>, most notably support for the <a href="http://swig.org/Doc2.0/D.html#D_nspace">nspace feature</a>, which allows you to map C++ namespaces to D packages/modules (it doesn’t work for free functions and global variables yet, but this is a general <span class="caps">SWIG</span> restriction that could be easily lifted, just ask me if you need it). There was another <span class="caps">SWIG</span> release in the meantime, version 2.0.3, but it was only a »quick backup« by the maintainer before he merged some intrusive Python changes. I was caught pretty much off-guard by it and had no time for real testing and thus, it contains some bugs (mainly related to nspace support when split-proxy mode is not activated, thanks to Jonathan Pfau for the reports) – please use <span class="caps">SVN</span> trunk instead.</p>
<p>Another little project I recently worked on is <a href="/code/units/">std.units</a>, a units of measurement implementation for D. This topic came up several times on the NG previously, and every time it was suggested to include units support in Phobos, so I have merged the work into my Phobos fork. Please note, however, that this is in no way a formal review request yet. There are still a couple of items left on my to-do list, but before tackling the remaining issues, I’d greatly appreciate some feedback (see the thread on the D newsgroup, <a href="http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=134590" title="Phobos?"><span class="caps">RFC</span>: Units of measurement for D</a>).</p>
<p>Finally, a personal note: Yesterday, I received notice that I was accepted to work on my <a href="/code/gsoc/thrift">Apache Thrift project</a> under the umbrella of Digital Mars as part of the <a href="http://www.google-melange.com/gsoc/homepage/google/gsoc2011">Google Summer of Code 2011</a> – thanks a lot to everybody who supported my proposals for their trust in me! I know that the expectations are high, and will do my very best to live up to them.</p>Quake-style drop-down terminal on OS X2011-02-27T00:00:00+00:00http://klickverbot.at/blog/2011/02/quake-style-terminal-on-osx<p>I’m currently using OS X 10.6 on my MacBook Pro and the combination of a polished UI and the familiar Posix foundation quite appeals to me (I’ll probably do a separate post on my experience with it eventually). Nevertheless, I am using the text console a lot, obviously when doing development work, but also for lots of everyday stuff I can still do faster with it.</p>
<p>Because I’m often using the console side-by-side with <span class="caps">GUI</span> applications, I found it really useful to be able to access a console overlay via a system-wide hotkey, just like in good old Quake (which I never personally played, by the way). This should give you an idea of how it looks:</p>
<figure><img alt="A screenshot of the drop-down console in Quake 4 overlaying the title screen." src="/blog/2011/02/quake-style-terminal-on-osx/quake4-console.png" /><figcaption>The in-game console at the Quake 4 title screen, toggled by the tilde key.</figcaption></figure>
<p>It doesn’t, unfortunately, come as a surprise that the OS X terminal application doesn’t support this out of the box, but fortunately there are several third-party tools for achieving this. First, I tried out <a href="http://visor.binaryage.com/"><em>Visor</em></a>, a <a href="http://www.culater.net/software/SIMBL/SIMBL.php"><em><span class="caps">SIMBL</span></em></a> plugin for Apple’s <em>Terminal</em>, which provides more or less exactly what I was looking for. Unfortunately, it turned out to be not quite as stable as I hoped (random crashes from time to time), and Terminal.app itself quite often has the annoying habit of not reacting to input, especially after killing an interactive console app with <code>Ctrl+C</code>.</p>
<p>But a few days ago, I discovered <a href="http://sites.google.com/site/iterm2home/"><em>iTerm2</em></a>, a replacement for the system terminal application, which also supports a system-wide hotkey to hide/unhide the console window and doesn’t suffer from the annoying lock-up problem of <em>Terminal</em>. It is still in alpha at the moment, but even the nightly build I am currently using (0.20.20110226) has been stable so far.</p>
<p>Just resizing the window to the top third of the screen and using it as an overlay does not quite work when using multiple virtual desktops (»Spaces« in OS X terminology) though: The console window always appears on the same desktop, even if you are working on another one, causing the window manager to switch desktops, which defeats the purpose of using an overlay in the first place. Fortunately, there happens to be an easy solution for this as well: The <a href="http://infinite-labs.net/afloat/"><em>Afloat</em></a> »window manager plugin« enables you to keep a window on all spaces, among a wealth of other power-user friendly features.</p>
<p>There is currently another minor quirk with iTerm2: Even though it defaults to <span class="caps">UTF</span>-8 input, it does not set the <code>LC_CTYPE</code> environment variable accordingly, which caused some problems with Ruby 1.9 applications for me (random encoding-related errors like »invalid byte sequence in US-<span class="caps">ASCII</span> (ArgumentError)«). The simple workaround is to add an <code>export</code> line to your <code>.profile</code>.</p>SWIG 2.0.2 with D support released2011-02-21T00:00:00+00:00http://klickverbot.at/blog/2011/02/swig-2-0-2-with-d-support-released<p>Yesterday, <a href="http://swig.org"><span class="caps">SWIG</span></a> version 2.0.2 was <a href="http://sourceforge.net/news/?group_id=1645&id=297686">officially released</a>. Along with various bug fixes for the other supported languages, this is the first release to support the <a href="http://d-programming-language.org">D programming language</a>. As always, you can get the release from the <a href="http://swig.org/download.html">download area</a>, but here are direct links to the files hosted at SourceForge for your convenience: One for the <a href="http://prdownloads.sourceforge.net/swig/swig-2.0.2.tar.gz">source tarball</a>, and another for <em><a href="http://prdownloads.sourceforge.net/swig/swigwin-2.0.2.zip">swigwin</a></em> which includes a pre-built Win32 executable.</p>
<p>Since my <a href="/blog/2010/11/announcing-d-support-in-swig/">first announcement</a>, there have been a number of changes and improvements. Among them were some critical fixes to the generated code when compiled on Windows, some minor ones regarding name collisions in the D part, and a fix to the »directors« feature, where the wrong C++ method would silently be called under certain circumstances (thanks to Jimmy Cao for reporting). Unfortunately, there were also some <a href="/blog/2010/12/swig-d-breaking-name-changes/">breaking name changes</a>, as previously mentioned on this blog. Furthermore, I added basic support for operator overloading; please refer to the <a href="http://www.swig.org/Doc2.0/D.html#D_operator_overloading">documentation</a> for details.</p>
<p>If you have any questions or need assistance with using <span class="caps">SWIG</span> on a certain library, feel free to <a href="/#contact">contact</a> me directly or to post to the <a href="http://swig.org/mail.html">swig-user</a> mailing list. During the next few days, I will be quite busy and cannot promise to reply quickly, but after that, I will be happy to help. Oh, and it would be great if you could share your personal experiences, common pitfalls and how to overcome them when using <span class="caps">SWIG</span> for the first time, since »Getting Started«-style documentation for people new to <span class="caps">SWIG</span> is a bit scarce at the moment!</p>git reset using Mercurial2011-01-09T00:00:00+00:00http://klickverbot.at/blog/2011/01/git-reset-using-mercurial<p>I am mainly a <a href="http://git-scm.com">Git</a> user, but lately I have been working with <a href="http://mercurial.selenic.com/">Mercurial</a> from time to time. I have mostly been using it for basic committing though, so when performing more advanced operations I still occasionally end up with a commit I did not mean to create. But undoing that can’t be too hard, right?</p>
<p>For example, I recently toyed around with <a href="http://mercurial.selenic.com/wiki/MqExtension">Mercurial Queues</a> to emulate Git’s staging area, one of those features that seem trivial, but which you don’t want to miss once you are used to them. Doing so, while being at, let’s say, revision <code>1000</code>, I accidentally created two changesets, <code>1001</code> and <code>1002</code>. Now, how do I get rid of these while still keeping their contents in the working copy? Using Git, this would just be <code>git reset HEAD~2</code>. Unfortunately, Mercurial seems to make your life somewhat hard in this case. This is what I came up with (please leave me a message in the comments in case I missed an easier way):</p>
<figure class='code'> <div class="highlight"><pre>hg update -r1000<br />
</pre></div></figure>
<p>This sets the working copy to the last »good« revision, only to …</p>
<figure class='code'> <div class="highlight"><pre>hg revert --all -r1002<br />
</pre></div></figure>
<p>… bring the changes from the two commits back to the working copy (but without committing them this time), so we can now …</p>
<figure class='code'> <div class="highlight"><pre>hg strip --force 1001<br />
</pre></div></figure>
<p>… strip the two changesets from the history.</p>
<p>I am perfectly aware of the fact that any other <span class="caps">SCM</span> tool will probably seem clumsy at first to someone used to Git (besides the fact that Git seems to be a natural fit for the way I think about versioning), but I still wonder whether there is a deeper reason for Mercurial not to support this more directly.</p>Breaking name changes in SWIG/D2010-12-01T00:00:00+00:00http://klickverbot.at/blog/2010/12/swig-d-breaking-name-changes<p>Sorry if this notice comes a bit late for some of you, but a few days ago I committed a breaking change to <a href="/blog/2010/11/announcing-d-support-in-swig/">D support</a> in <a href="http://swig.org"><span class="caps">SWIG</span></a> trunk. It was needed to bring the names used in the D module in line with the C# one, the naming scheme of which was intended to be language-independent by the principal maintainer (although it is only used in the C# and D parts right now).</p>
<p>Most of the changes revolve around the term »wrap D module« being replaced with »intermediary D module«, including names derived from it. To adapt your interface files, just perform the following replacements:</p>
<figure class='code'> <div class="highlight"><pre>s/cwtype/ctype/g<br />
s/dwtype/imtype/g<br />
s/dptype/dtype/g<br />
<br />
s/<span class="nv">$wcall</span>/<span class="nv">$imcall</span>/g<br />
s/<span class="nv">$dpcall</span>/<span class="nv">$dcall</span>/g<br />
<br />
s/wrapdmodule/imdmodule/g<br />
</pre></div></figure>Announcing: D support in SWIG2010-11-21T00:00:00+00:00http://klickverbot.at/blog/2010/11/announcing-d-support-in-swig<p>In a nutshell, <a href="http://swig.org"><span class="caps">SWIG</span></a> is a »glue code« generator, allowing you to access C/C++ libraries from various target languages, including C#, Go, Java, Ruby, Python … and, since I merged my work into <span class="caps">SWIG</span> trunk a few days ago, also the <a href="http://digitalmars.com/d/">D programming language</a>, both version 1 and 2.</p>
<p>Why would D support in <span class="caps">SWIG</span> be useful in the first place? After all, D is perfectly able to <a href="http://www.digitalmars.com/d/1.0/interfaceToC.html">interface with C</a> on its own, so why bother using a third-party tool?</p>
<p>Well, it turns out that even for »plain old C«, there are reasons why you’d want to use a bindings generator. Besides the obvious problem that you have to convert the C header files to D modules somehow, there is one major inconvenience with directly using C libraries from D: D code usually is on a higher abstraction level than C, and many of the features that make D interesting are simply not available when dealing with C libraries. For instance, you would have to manually convert strings between pointers to <code>\0</code>-terminated char arrays and D <code>string</code>s, and most interesting algorithms from the D2 standard library are simply unusable with C arrays.</p>
<p>While these issues can be worked around relatively easily by hand-coding a thin wrapper layer around the C library in question, there is another area where writing wrapper code by hand is not feasible: C++ class libraries. D1 does not support interfacing with C++ at all, and even though <code>extern(C++)</code> has been added to D2, the support is quite limited, and a custom wrapper layer is still required in many cases.</p>
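<p>To make the shape of such a wrapper layer concrete, here is a small sketch in plain C++ (all names invented for illustration; this is the general pattern a generator like <span class="caps">SWIG</span> automates, not its actual output): every method becomes a free <code>extern "C"</code> function taking an opaque handle, and only C-compatible types cross the boundary.</p>

```cpp
#include <cassert>

// A hypothetical C++ class we would like to expose to another language.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    int next() { return value_++; }  // returns the current value, then increments
private:
    int value_;
};

// The hand-written "flat" C bridge: constructor, destructor and each
// method become free functions operating on an opaque void* handle.
extern "C" {
    void* Counter_new(int start) { return new Counter(start); }
    void Counter_delete(void* self) { delete static_cast<Counter*>(self); }
    int Counter_next(void* self) { return static_cast<Counter*>(self)->next(); }
}
```

<p>On the D side, such functions would then be declared <code>extern(C)</code> and wrapped again in a proxy class to restore the object-oriented interface; doing this by hand for every method of a large class library is exactly the busywork a generator removes.</p>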
<p>Here is, without further ado, a small example of what the D module for <span class="caps">SWIG</span> allows you to do. Consider the following (admittedly not very useful) piece of C++ code:</p>
<figure class='code'> <div class="highlight"><pre><span class="k">typedef</span> <span class="n">std</span><span class="o">::</span><span class="n">pair</span><span class="o"><</span><span class="kt">float</span><span class="p">,</span> <span class="kt">float</span><span class="o">></span> <span class="n">Position</span><span class="p">;</span><br />
<br />
<span class="k">class</span> <span class="nc">Shape</span> <span class="p">{</span><br />
<span class="k">public</span><span class="o">:</span><br />
<span class="n">Shape</span><span class="p">(</span> <span class="n">Position</span> <span class="n">pos</span> <span class="p">)</span> <span class="o">:</span> <span class="n">m_position</span><span class="p">(</span> <span class="n">pos</span> <span class="p">)</span> <span class="p">{}</span><br />
<span class="k">virtual</span> <span class="o">~</span><span class="n">Shape</span><span class="p">()</span> <span class="p">{}</span><br />
<br />
<span class="k">virtual</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">getDescription</span><span class="p">()</span> <span class="k">const</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span><br />
<br />
<span class="n">Position</span> <span class="n">getPosition</span><span class="p">()</span> <span class="k">const</span> <span class="p">{</span><br />
<span class="k">return</span> <span class="n">m_position</span><span class="p">;</span><br />
<span class="p">}</span><br />
<br />
<span class="k">protected</span><span class="o">:</span><br />
<span class="n">Position</span> <span class="n">m_position</span><span class="p">;</span><br />
<span class="p">};</span><br />
<br />
<span class="k">class</span> <span class="nc">Circle</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Shape</span> <span class="p">{</span><br />
<span class="k">public</span><span class="o">:</span><br />
<span class="n">Circle</span><span class="p">(</span> <span class="n">Position</span> <span class="n">pos</span> <span class="p">)</span> <span class="o">:</span> <span class="n">Shape</span><span class="p">(</span> <span class="n">pos</span> <span class="p">)</span> <span class="p">{}</span><br />
<span class="k">virtual</span> <span class="o">~</span><span class="n">Circle</span><span class="p">()</span> <span class="p">{}</span><br />
<br />
<span class="k">virtual</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">getDescription</span><span class="p">()</span> <span class="k">const</span> <span class="p">{</span><br />
<span class="k">return</span> <span class="s">"A perfect circle."</span><span class="p">;</span><br />
<span class="p">}</span><br />
<span class="p">};</span><br />
<br />
<span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">toString</span><span class="p">(</span> <span class="k">const</span> <span class="n">Shape</span><span class="o">&</span> <span class="n">shape</span> <span class="p">)</span> <span class="p">{</span><br />
<span class="n">std</span><span class="o">::</span><span class="n">ostringstream</span> <span class="n">result</span><span class="p">;</span><br />
<br />
<span class="n">Position</span> <span class="n">p</span> <span class="o">=</span> <span class="n">shape</span><span class="p">.</span><span class="n">getPosition</span><span class="p">();</span><br />
<span class="n">result</span> <span class="o"><<</span> <span class="s">"A shape at ("</span> <span class="o"><<</span> <span class="n">p</span><span class="p">.</span><span class="n">first</span> <span class="o"><<</span> <span class="s">", "</span> <span class="o"><<</span> <span class="n">p</span><span class="p">.</span><span class="n">second</span> <span class="o"><<</span> <span class="s">")."</span><span class="p">;</span><br />
<span class="n">result</span> <span class="o"><<</span> <span class="s">" It looks like this: "</span> <span class="o"><<</span> <span class="n">shape</span><span class="p">.</span><span class="n">getDescription</span><span class="p">();</span><br />
<br />
<span class="k">return</span> <span class="n">result</span><span class="p">.</span><span class="n">str</span><span class="p">();</span><br />
<span class="p">}</span><br />
</pre></div></figure>
<p>By using <span class="caps">SWIG</span> to generate the necessary glue code, you can easily make the classes available in D, as demonstrated by the following small program:</p>
<figure class='code'> <div class="highlight"><pre><span class="k">class</span> <span class="n">Square</span> <span class="p">:</span> <span class="n">Shape</span> <span class="p">{</span><br />
<span class="k">this</span><span class="p">(</span> <span class="n">Position</span> <span class="n">pos</span> <span class="p">)</span> <span class="p">{</span><br />
<span class="k">super</span><span class="p">(</span> <span class="n">pos</span> <span class="p">);</span><br />
<span class="p">}</span><br />
<br />
<span class="k">override</span> <span class="nb">string</span> <span class="n">getDescription</span><span class="p">()</span> <span class="k">const</span> <span class="p">{</span><br />
<span class="k">return</span> <span class="s">"Quite square-ish."</span><span class="p">;</span><br />
<span class="p">}</span><br />
<span class="p">}</span><br />
<br />
<span class="kt">void</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span><br />
<span class="c1">// One of the ugliest bugs currently in D: Type inference does not</span><br />
<span class="c1">// work correctly for arrays of classes with a common supertype.</span><br />
<span class="k">auto</span> <span class="n">shapes</span> <span class="p">=</span> <span class="p">[</span><br />
<span class="k">cast</span><span class="p">(</span> <span class="n">Shape</span> <span class="p">)</span> <span class="k">new</span> <span class="n">Circle</span><span class="p">(</span> <span class="k">new</span> <span class="n">Position</span><span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span> <span class="p">)</span> <span class="p">),</span><br />
<span class="k">new</span> <span class="n">Square</span><span class="p">(</span> <span class="k">new</span> <span class="n">Position</span><span class="p">(</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">1</span> <span class="p">)</span> <span class="p">)</span><br />
<span class="p">];</span><br />
<br />
<span class="k">foreach</span> <span class="p">(</span> <span class="n">shape</span><span class="p">;</span> <span class="n">shapes</span> <span class="p">)</span> <span class="p">{</span><br />
<span class="n">writeln</span><span class="p">(</span> <span class="n">toString</span><span class="p">(</span> <span class="n">shape</span> <span class="p">)</span> <span class="p">);</span><br />
<span class="p">}</span><br />
<span class="p">}</span><br />
</pre></div></figure>
<figure><pre><code>A shape at (1, 3). It looks like this: A perfect circle.
A shape at (2, 1). It looks like this: Quite square-ish.</code></pre></figure>
<p>Note that <code>Shape</code> is extended on the D side just as usual and how the C++ call to <code>getDescription()</code> is transparently routed to <code>Square.getDescription()</code>. This mechanism dubbed <em>cross language polymorphism</em> is enabled by a feature of <span class="caps">SWIG</span> called »directors«, which causes the extra indirection layer needed for this to be emitted. Also note how the strings are seamlessly converted between their C++ and D representation.</p>
<p>So you want to give the D module in <span class="caps">SWIG</span> a whirl? Just head over to the <a href="https://swig.svn.sourceforge.net/svnroot/swig/trunk/"><span class="caps">SWIG</span> <span class="caps">SVN</span></a>, grab the sources from there, and <a href="http://swig.org/svn.html">build it</a>. If you are planning to run the test suite or the included examples, you might want to specify <code>--with-d1-compiler=<…></code> and <code>--with-d2-compiler=<…></code> at the <code>configure</code> command line. In case you want to play around with the small example from above, I also put up a <a href="demo.zip">small archive</a> containing the files (for such a small example, the C++ code could be included directly in the <span class="caps">SWIG</span> interface file via the <code>%inline</code> directive, but that’s how you would probably want to tackle a real library).</p>
<p>What can you expect to work? The test-suite which covers all the basic features of <span class="caps">SWIG</span> should build and run fine, which means that it will probably just work when trying to wrap a library. The source tree also includes a documentation chapter on D (<code>Doc/Manual/D.html</code>) which describes the basic structure and some of the D-specific features. As the D module started out as a fork of the C# one, the documentation on C# could be of considerable use for you as well.</p>
<p>There are still a few areas which need serious work, though. One of them is <em>operator overloading</em>, where both semantics and implementation differ quite a lot between C++ and D. It would probably not be too hard to come up with a solution (maybe using D’s extensive compile-time reflection capabilities to avoid having to add special cases to the <span class="caps">SWIG</span> module), but I would really appreciate some help from someone actually needing it here.</p>
<p>The other big one is <em>multithreading support</em>. Since I personally have not needed to use C++ libraries from D in a threaded setting yet, I have not really thought about the problems arising from multiple threads calling the wrapper code. Especially in combination with the garbage collector, I expect quite a lot of issues to pop up in a serious multithreaded environment. There are a few places which include threading-related code (<code>synchronized</code>, <code>shared</code>, …), but these are mostly remnants from the C# module, which may or may not apply to D – once again, I would be happy if somebody needing this would help me out here.</p>
<p>Speaking of C# remnants: As mentioned above, the D module was forked from the C# module, which in turn started out as a fork from the Java one. Due to this heritage, there are a few places where things could be done much more easily in D. For example, the code for <em>returning C strings to D</em> without memory leaks is unnecessarily complex at the moment. But the same applies here as well – I would be happy to support anyone wanting to clean this up, but the current implementation did its job for me so far.</p>
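<p>For the curious, the underlying problem is easy to show in isolation. The following is only a hand-written sketch of one common approach (the function name is invented; this is not the actual generated code): the C++ string is copied into <code>malloc</code>’d storage before crossing the C boundary, and freeing that copy becomes the caller’s – i.e. the target language’s – responsibility.</p>

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Copy a std::string into malloc'd storage so the pointer stays valid
// after the std::string goes out of scope; the caller must free() the
// result, otherwise the wrapper leaks on every call.
extern "C" char* make_greeting_copy(const char* name) {
    std::string s = std::string("Hello, ") + name;
    char* out = static_cast<char*>(std::malloc(s.size() + 1));
    std::memcpy(out, s.c_str(), s.size() + 1);  // include the trailing '\0'
    return out;
}
```

<p>The bookkeeping around who allocates and who frees is precisely the part that the current implementation inherited from C#, and where D’s garbage collector would allow a simpler scheme.</p>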
<p>Anyway, I would be glad if some of you could go ahead and put <span class="caps">SWIG</span> to real-world use, so that any major bug can be fixed before the next <span class="caps">SWIG</span> release (not planned so far). If you stumble upon any issues or if any questions should arise, please feel free to contact me, either via <a href="/about#contact">mail</a>, on <a href="http://www.digitalmars.com/webnews/newsgroups.php?group=digitalmars.D">digitalmars.D</a> or in <a href="irc://irc.freenode.net/D">#D on freenode</a>. Besides that, as always, it would also be nice just to hear about what you are doing with this.</p>
<p class="update">In the meantime, two severe bugs in the code generated for Windows have been fixed; please be sure to use the latest version from <span class="caps">SVN</span>.</p>Oh, how glad I am that ActionScript 2 is dead and buried…2010-06-08T00:00:00+01:00http://klickverbot.at/blog/2010/06/oh-how-glad-i-am-that-actionscript-2-is-dead-and-buried<p>… but unfortunately not in a personal project of mine that I started quite a while ago and which I have resumed work on recently.</p>
<p>Today, I finally managed to fix a bug which had already taken me some hours to track down. Basically, mouse-over and -off events would not work properly on certain <code>MovieClip</code>s. After some digging through my custom code managing these events (which I needed to come up with because there is no way to let hover events bubble up the display hierarchy in ActionScript 2), I found that <code>hitTest()</code> wouldn’t work properly on these clips.</p>
<p>Now the fun part began. I meticulously checked every aspect of the <code>MovieClip</code>s for anything special; I even considered that it might have something to do with the fact that they were positioned right behind some <code>TextField</code>s, which could have triggered some Flash player bugs (given that they are already surrounded by a cloud of weirdness in AS2). Nope, nothing.</p>
<p>It wasn’t until I had already pretty much given up that I noticed that the name of the clips in question contained a period. After I removed it … <code>hitTest()</code> worked fine – thanks a lot for wasting my time, Adobe! Not only should this not happen in the first place, at least not without a runtime warning; the fact that you cannot use periods in <code>MovieClip</code> names is apparently also undocumented.</p>
<p>Oh well…</p>STL Algorithms2010-04-20T00:00:00+01:00http://klickverbot.at/blog/2010/04/stl-algorithms<p>While attending a workshop at <a href="http://linz.linuxwochen.at">Linuxwochen Linz</a> recently, I found myself using <code>std::for_each</code> and other algorithms from the C++ Standard Template Library without even really thinking about it, much to the surprise of the other workshop attendees, who, unlike me, were artists rather than coders. As I was thinking about the way my C++ coding style evolved over the years, I remembered that my use of <code><algorithm></code>s can be traced back to a single article on Dr. Dobb’s: <a href="http://www.drdobbs.com/cpp/184401446"><span class="caps">STL</span> Algorithms vs. Hand-Written Loops</a>.</p>
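<p>The gist of that comparison fits in a few lines. This is only an illustrative sketch (function names invented, and using a C++11 lambda rather than the hand-rolled functors of the article’s era): the same computation written once as a manual loop and once with an <code><algorithm></code> call that states the intent up front.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hand-rolled loop: the reader has to infer the intent from the body.
std::vector<int> doubled_loop(const std::vector<int>& in) {
    std::vector<int> out;
    for (std::size_t i = 0; i < in.size(); ++i)
        out.push_back(in[i] * 2);
    return out;
}

// <algorithm> version: "transform every element" is stated once, up front.
std::vector<int> doubled_transform(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int x) { return x * 2; });
    return out;
}
```

<p>Both produce the same result, but the second version names the operation instead of spelling out its mechanics, which is the core of Meyers’ argument for preferring algorithms over raw loops.</p>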
<p>The text was written by Scott Meyers almost ten years ago, but in my eyes, it is still a very good read on the topic. Highly recommended!</p>The Joys of OPTLINK2010-02-13T00:00:00+00:00http://klickverbot.at/blog/2010/02/the-joys-of-optlink<p>As you might know, <span class="caps">DMD</span>/Windows (the reference compiler for the <a href="http://www.digitalmars.com/d/1.0/">D</a> programming language) does not use the standard <span class="caps">COFF</span> format for the object files it generates, but the fairly obscure <span class="caps">OMF</span> instead. This fact alone causes quite a number of annoyances. For example, the format differences make it impossible to link static libraries produced by other compilers into D projects, which is especially annoying since this also applies to <span class="caps">DLL</span> import libraries. You also cannot use any tools which expect object files in <span class="caps">COFF</span> format, and vice versa.</p>
<p>However, all of these issues, as annoying as they may be, do not pose a serious problem, they can all be worked around. But there is another one, and it has seven letters: <span class="caps">OPTLINK</span>. <span class="caps">OPTLINK</span>, courtesy of Digital Mars, is the linker which comes with <span class="caps">DMD</span>. There are quite a number of issues with it:</p>
<p>First, it is proprietary closed-source software. Apart from some people’s idealistic worries, this also poses a serious problem to more pragmatically inclined coders because <em>there are no alternative linkers</em> for <span class="caps">OMF</span>, at least none that are even half-decent. This means that if you stumble upon a bug, you can do nothing more than wait for Walter Bright to fix it.</p>
<p>Second, even if the source code was available, it would probably still be hard to fix bugs, since, according to Walter himself, large parts are written in assembler – a <em>maintainer’s nightmare</em>. This might also explain why it took him so long to fix some serious bugs in the past…</p>
<p>Third, there are bugs. <em>Lots of bugs</em>, compared to other linkers and with the pretty high version number (8.00.2) in mind. If you want to know what I am talking about, just search the D newsgroups; projects which make extensive use of templates seem to be affected more often than others. Until yesterday, I personally had been spared from this kind of issues, but the <span class="caps">OPTLINK</span> bug I encountered yesterday almost drove me crazy, because one wouldn’t expect this at all: <!--more--></p>
<p>After I had worked quite some time on Linux exclusively, I needed to compile a Windows version of a project of mine. So I went ahead and rebooted, updated <span class="caps">DMD</span>, Tango and a few other tools. Everything worked fine, the project even built fine, until I needed to build debug symbols into the binary. Every time I just added the <code>-g</code> flag to the compiler invocation, <span class="caps">OPTLINK</span> would abort with »Error 118: Filename Expected«. Because I had also upgraded my build tool, my first thought was that the linker commands could really be broken, but on closer inspection, it turned out that the invocation was generated perfectly fine. So I went on and downgraded all of the tools again, but to no avail – again the same error, although debug builds had worked flawlessly in the past.</p>
<p>After having searched for about an hour, I finally found the cause, and I could not really believe it at first: Compared to my previous D/Windows setup, I had added the Notepad++ installation directory to my <code>PATH</code>. You might ask yourself now, »Um, what? How should that break the linker?« Well, it turned out that <span class="caps">OPTLINK</span> apparently has problems with handling plus signs in all the lookup paths it uses, including not only the ones passed at the command line, but also those from the environment variables.</p>
<p>For a second I was really tempted to just drop <span class="caps">DMD</span> altogether, but unfortunately, there currently is no other D compiler of comparable quality for Windows. In my eyes, it would really help if <span class="caps">DMD</span> used <span class="caps">COFF</span> for its object files, making it possible to easily switch out <span class="caps">OPTLINK</span>, since the maturity of the tool-chain is currently the number one problem of D.</p>Setting up GDC and Tango on Linux x862009-10-26T00:00:00+00:00http://klickverbot.at/blog/2009/10/setting-up-gdc-and-tango-on-linux-x86<p>Currently, there are three more-or-less working compilers for the <a href="http://digitalmars.com/d/">D programming language</a> (version 1): The oldest and most mature one is <span class="caps">DMD</span>, short for Digital Mars D Compiler, the official reference implementation by Walter Bright, the creator of D. It has grown reasonably stable, but has certain limitations, most of them resulting from using a proprietary back-end. Additionally, not all parts of it are Open Source (starting with a capital letter). The second one is <a href="http://dsource.org/projects/ldc"><span class="caps">LDC</span></a>, a rather young, but quick-moving project which aims to port the front-end of <span class="caps">DMD</span> to the also fairly recent <a href="http://llvm.org"><span class="caps">LLVM</span></a> compiler framework in order to leverage its advanced code generation and optimization infrastructure. While it still has some bugs to iron out (most notably missing exception support on Windows), it works reasonably well on Linux x86 (32 and 64 bit). The third compiler, and subject of interest for this post, is <span class="caps">GDC</span>. 
Like the other two compilers, it uses the Digital Mars D front-end, but coupled to the very mature <span class="caps">GNU</span> Compiler Collection (<span class="caps">GCC</span>) back-end, whose C/C++ compiler is widely used on Unix-like systems like Linux, Mac OS X, various flavors of <span class="caps">BSD</span> and also Windows through <a href="http://mingw.org">MinGW</a>. Unfortunately, development on it has stalled, making it pretty much unusable due to the many bugs in the old <span class="caps">DMD</span> front-end it uses.</p>
<p>However, an effort to resurrect <span class="caps">GDC</span> has recently been started. Development takes place over at <a href="http://bitbucket.org/goshawk/gdc">bitbucket</a> (you can also find building instructions for <span class="caps">GDC</span> there) and the project has already been able to celebrate a first success: The reasonably recent front-end versions 1.040 and 2.015 (for D2) are working with <span class="caps">GCC</span> 4.3. This seemed enough of a sign of life for me to decide to give <span class="caps">GDC</span> another try. After some initial problems (some of which resulted from bugs which have already been fixed in the official Mercurial repository) I managed to compile a <span class="caps">GDC</span> binary (front-end version 1.040 against <span class="caps">GCC</span> 4.3.1) which happily compiles the <a href="http://dsource.org/projects/tango">Tango</a> standard library and a personal project of mine. This is what I did (silently omitting quite a few hours of searching and fixing bugs): <!--more--></p>
<p>First, go to some temporary directory and checkout the <span class="caps">GDC</span> sources from the Mercurial repository (at the time of writing, revision 53 was current):</p>
<figure class='code'> <div class="highlight"><pre><span class="nb">cd</span> ~/tmp<br />
hg clone http://bitbucket.org/goshawk/gdc<br />
</pre></div></figure>
<p>Then, download the core of <span class="caps">GCC</span> 4.3.1 from a <a href="http://gcc.gnu.org/mirrors.html">mirror near you</a> (version 4.3.2 should also work, but builds against 4.3.4 are currently known to be broken) and extract it inside the <span class="caps">GDC</span> sources:</p>
<figure class='code'> <div class="highlight"><pre>wget ftp://gd.tuwien.ac.at/gnu/gcc/releases/gcc-4.3.1/gcc-core-4.3.1.tar.bz2<br />
mkdir gdc/dev<br />
<span class="nb">cd </span>gdc/dev<br />
tar xjvf ../../gcc-core-4.3.1.tar.bz2<br />
</pre></div></figure>
<p>Now, link the <span class="caps">GDC</span> sources into the extracted directory and use the provided <code>setup-gcc.sh</code> script to patch <span class="caps">GCC</span> to enable D version 1:</p>
<figure class='code'> <div class="highlight"><pre><span class="nb">cd </span>gcc-4.3.1<br />
ln -s ../../../d gcc/d<br />
gcc/d/setup-gcc.sh --d-language-version<span class="o">=</span>1<br />
</pre></div></figure>
<p>After that, you are ready to build and install <span class="caps">GCC</span> with D support. For this, go to some build directory and run <code>configure</code> and <code>make</code>. You can, of course, choose an arbitrary directory for the build files (for instance, I personally prefer having them completely outside the source directory):</p>
<figure class='code'> <div class="highlight"><pre>mkdir build<br />
<span class="nb">cd </span>build<br />
../configure --enable-languages<span class="o">=</span>d --disable-multilib --disable-shared --prefix<span class="o">=</span>/opt/gdc<br />
make<br />
sudo make install<br />
</pre></div></figure>
<p>Note that I configured <span class="caps">GCC</span>/<span class="caps">GDC</span> to be installed in <code>/opt/gdc</code>. As the build also includes the C compiler, this avoids any interference with the »normal« <span class="caps">GCC</span> probably installed in <code>/usr</code>. After the build has finished – this takes quite a while, since <span class="caps">GCC</span> is built three times to bootstrap itself – you should have a working <span class="caps">GDC</span> executable in <code>/opt/gdc/bin</code>. Now for the second part, Tango:</p>
<p>Start off by fetching the Tango sources from the <span class="caps">SVN</span> to a temporary working directory (I worked with revision 5023):</p>
<figure class='code'> <div class="highlight"><pre><span class="nb">cd</span> ~/tmp<br />
svn co http://svn.dsource.org/projects/tango/trunk tango<br />
</pre></div></figure>
<p>Unfortunately, Tango currently does not compile with <span class="caps">GDC</span> out of the box, you have to apply a couple of minor changes: The first change adds build/arch files for <span class="caps">GDC</span>/Linux:</p>
<figure class='code'> <div class="highlight"><pre><span class="gh">diff --git a/build/arch/linux-i686-gdc-dbg.mak b/build/arch/linux-i686-gdc-dbg.mak</span><br />
<span class="gd">--- /dev/null</span><br />
<span class="gi">+++ b/build/arch/linux-i686-gdc-dbg.mak</span><br />
<span class="gu">@@ -0,0 +1,6 @@</span><br />
<span class="gi">+include $(ARCHDIR)/gdc.rules</span><br />
<span class="gi">+include $(ARCHDIR)/linux.inc</span><br />
<span class="gi">+</span><br />
<span class="gi">+# -Wall breaks the compilation with wrong errors</span><br />
<span class="gi">+DFLAGS_COMP=-g</span><br />
<span class="gi">+CFLAGS_COMP=-g</span><br />
<br />
<span class="gh">diff --git a/build/arch/linux-i686-gdc-opt.mak b/build/arch/linux-i686-gdc-opt.mak</span><br />
<span class="gd">--- /dev/null</span><br />
<span class="gi">+++ b/build/arch/linux-i686-gdc-opt.mak</span><br />
<span class="gu">@@ -0,0 +1,5 @@</span><br />
<span class="gi">+include $(ARCHDIR)/gdc.rules</span><br />
<span class="gi">+include $(ARCHDIR)/linux.inc</span><br />
<span class="gi">+</span><br />
<span class="gi">+DFLAGS_COMP=-O2</span><br />
<span class="gi">+CFLAGS_COMP=-O2</span><br />
<br />
<span class="gh">diff --git a/build/arch/linux-i686-gdc-tst.mak b/build/arch/linux-i686-gdc-tst.mak</span><br />
<span class="gd">--- /dev/null</span><br />
<span class="gi">+++ b/build/arch/linux-i686-gdc-tst.mak</span><br />
<span class="gu">@@ -0,0 +1,5 @@</span><br />
<span class="gi">+include $(ARCHDIR)/gdc.rules</span><br />
<span class="gi">+include $(ARCHDIR)/linux.inc</span><br />
<span class="gi">+</span><br />
<span class="gi">+DFLAGS_COMP=-g -fdeprecated -fdebug=UnitTest -funittest</span><br />
<span class="gi">+CFLAGS_COMP=-g</span><br />
</pre></div></figure>
<p>The second change removes the <code>-fversion=Posix</code> flag from the Makefile of the runtime because the <span class="caps">DMD</span> frontend <span class="caps">GDC</span> currently uses (1.040) does not allow it to be specified as it is set automatically (this restriction has been lifted in later versions):</p>
<figure class='code'> <div class="highlight"><pre><span class="gh">diff --git a/runtime/compiler/gdc/Makefile.am b/runtime/compiler/gdc/Makefile.am</span><br />
<span class="gd">--- a/runtime/compiler/gdc/Makefile.am</span><br />
<span class="gi">+++ b/runtime/compiler/gdc/Makefile.am</span><br />
<span class="gu">@@ -18,7 +18,7 @@</span><br />
 # AUTOMAKE_OPTIONS = 1.9.6 foreign no-dependencies<br />
<br />
 OUR_CFLAGS=@DEFS@ -I.<br />
<span class="gd">-D_EXTRA_DFLAGS=-nostdinc -pipe -I../../.. -I../shared -fversion=Posix</span><br />
<span class="gi">+D_EXTRA_DFLAGS=-nostdinc -pipe -I../../.. -I../shared</span><br />
 ALL_DFLAGS = $(DFLAGS) $(D_MEM_FLAGS) $(D_EXTRA_DFLAGS) $(MULTIFLAGS)<br />
<br />
 host_alias=.<br />
<span class="gh">diff --git a/runtime/compiler/gdc/Makefile.in b/runtime/compiler/gdc/Makefile.in</span><br />
<span class="gd">--- a/runtime/compiler/gdc/Makefile.in</span><br />
<span class="gi">+++ b/runtime/compiler/gdc/Makefile.in</span><br />
<span class="gu">@@ -228,7 +228,7 @@ target_vendor = @target_vendor@</span><br />
 top_builddir = @top_builddir@<br />
 top_srcdir = @top_srcdir@<br />
 OUR_CFLAGS = @DEFS@ -I.<br />
<span class="gd">-D_EXTRA_DFLAGS = -nostdinc -pipe -I../../.. -I../shared -fversion=Posix</span><br />
<span class="gi">+D_EXTRA_DFLAGS = -nostdinc -pipe -I../../.. -I../shared</span><br />
 ALL_DFLAGS = $(DFLAGS) $(D_MEM_FLAGS) $(D_EXTRA_DFLAGS) $(MULTIFLAGS)<br />
 toolexecdir = $(phobos_toolexecdir)<br />
 toolexeclibdir = $(phobos_toolexeclibdir)<br />
</pre></div></figure>
<p>The third and last change adds a workaround to Tango’s user library for a bug in the <span class="caps">DMD</span> front-end which has been fixed by now (the compiler fails to resolve the type of the template parameter in the templated <code>intpow</code> function):</p>
<figure class='code'> <div class="highlight"><pre><span class="gh">diff --git a/user/tango/math/internal/BiguintCore.d b/user/tango/math/internal/BiguintCore.d</span><br />
<span class="gd">--- a/user/tango/math/internal/BiguintCore.d</span><br />
<span class="gi">+++ b/user/tango/math/internal/BiguintCore.d</span><br />
<span class="gu">@@ -516,7 +516,7 @@ static BigUint pow(BigUint x, ulong y)</span><br />
}<br />
y0 = y/p;<br />
finalMultiplier = intpow(x0, y - y0*p);<br />
<span class="gd">- x0 = intpow(x0, p);</span><br />
<span class="gi">+ x0 = intpow!(BigDigit)(x0, p);</span><br />
}<br />
xlength = 1;<br />
}<br />
</pre></div></figure>
<p>After you have applied these patches, you should be ready to build Tango (make sure that you have a <code>cc</code> somewhere in your <code>PATH</code>, if not, create a link to your system’s <code>gcc</code>):</p>
<figure class='code'> <div class="highlight"><pre>sudo <span class="nv"><span class="caps">PATH</span></span><span class="o">=</span><span class="nv">$<span class="caps">PATH</span></span>:/opt/gdc/bin <span class="nv">DC</span><span class="o">=</span>gdc build/build.sh —lib-install-dir /opt/gdc/lib<br />
</pre></div></figure>
<p>However, I had to remove Phobos’ <code>object.d</code> from <code>/opt/gdc/include/d/4.3.1</code> first. <code>build/build.sh</code> should finish with a note reminding you that the user libraries still have to be installed. To do this, simply copy the contents of the <code>user</code> directory to <code>/opt/gdc/include/d/4.3.1</code> after removing the old include files which are part of Phobos (you have to keep the <code>gcc</code> and <code>i686-pc-linux-gnu</code> directories though). Congratulations, now you should be able to build your Tango projects with <span class="caps">GDC</span>!</p>
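<p>Spelled out as shell, the include shuffle above looks roughly like the following. This is a hedged sketch only: it rehearses the steps in a scratch directory instead of the real <code>/opt/gdc/include/d/4.3.1</code> (which you would of course manipulate with <code>sudo</code>), and the module file names are invented stand-ins:</p>

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"

# Stand-ins for the real locations: "inc" mimics /opt/gdc/include/d/4.3.1,
# and "user" mimics the user directory of the Tango source tree.
INC=inc
mkdir -p "$INC/gcc" "$INC/i686-pc-linux-gnu" "$INC/std"
touch "$INC/object.d"                      # stale Phobos include
mkdir -p user/tango
touch user/object.di user/tango/text.d     # hypothetical Tango modules

# Remove the old Phobos includes, keeping gcc/ and i686-pc-linux-gnu/:
find "$INC" -mindepth 1 -maxdepth 1 \
    ! -name gcc ! -name i686-pc-linux-gnu -exec rm -rf {} +

# Copy the Tango user library into place:
cp -r user/. "$INC"/
```

On the real system, the same <code>find</code> and <code>cp</code> run against <code>/opt/gdc/include/d/4.3.1</code> under <code>sudo</code>.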
<p>A quick tip for <a href="http://www.dsource.org/projects/dsss/"><span class="caps">DSSS</span></a> users: You probably have to modify your <code>gdc-posix-tango</code> profile to omit the <code>-version=Posix</code> switch (see above) on <code>gdmd</code> calls and add <code>-L-ltango-base-gdc</code> to the linker flags since Tango was not installed via <span class="caps">DSSS</span> in the above instructions.</p>
<p class="update">Since I originally wrote this post, Tango’s build system was modified yet another time (at least, things are much simpler now). Instead of fiddling around with the makefiles, just use the <code>bob</code> tool from the <code>build</code> directory, which <em>should</em> work with <span class="caps">GDC</span> out of the box.</p>The Power of Git2009-07-28T00:00:00+01:00http://klickverbot.at/blog/2009/07/the-power-of-git<p>As you might already know if you read my <a href="/blog/2008/08/getting-started-with-git/">blog post</a> about it, I have been using Git for quite a while now. However, I am still regularly amazed by the fact that Git <em>simply works</em>, in the sense that it really does the things you tell it to do.</p>
<p>Recently, for instance, I wanted to merge an extension to the great <a href="http://assimp.sf.net">Open Asset Import Library</a> (bindings for the <a href="http://www.digitalmars.com/d/">D programming language</a>, in fact) which I developed locally in Git to the upstream repository in a way that the commit history was kept locally. However, <span class="caps">SVN</span> is used as <span class="caps">SCM</span> system for upstream development. So I started out by importing the upstream <span class="caps">SVN</span> repository via <code>git-svn</code>:</p>
<figure class='code'> <div class="highlight"><pre><span class="nv">$ </span>mkdir assimp<span class="p">;</span> <span class="nb">cd </span>assimp<br />
<span class="nv">$ </span>git svn init https://assimp.svn.sourceforge.net/svnroot/assimp/trunk<br />
<span class="nv">$ </span>git svn fetch<br />
</pre></div></figure>
<p>Nothing too exciting here. So far, I only created a local Git clone of the <span class="caps">SVN</span> repository which I probably will use for contributing to upstream development in the future. But how to transfer the bindings from the Git repository to this one <em>including</em> their (strictly linear, i.e. master-only) commit history? Because Git does not try to be smarter than its users, the first solution I came up with worked flawlessly. Here is what I did:</p>
<figure class='code'> <div class="highlight"><pre><span class="nv">$ </span>git checkout -b d-bindings<br />
<span class="nv">$ </span>git fetch ../dAssimp<br />
<span class="nv">$ </span>git <span class="nb">read</span>-tree --prefix<span class="o">=</span>port/dAssimp FETCH_HEAD<br />
<span class="nv">$ </span>git rev-parse FETCH_HEAD > .git/MERGE_HEAD<br />
<span class="nv">$ </span>git commit<br />
</pre></div></figure>
<p>After switching to a new branch in which the history should be stored, I told Git to fetch the contents of the local <code>dAssimp</code> repository (the D bindings I developed). Because I had not made any merges, Git simply stored the <code>HEAD</code> of the other repository in <code>FETCH_HEAD</code>. The <code>read-tree</code> command reads, as the name suggests, arbitrary tree information into the index. The <code>--prefix</code> switch allows you to keep the current index and read the tree into an (empty) subdirectory instead – perfect for what I intended to do. Storing the <code>FETCH_HEAD</code>‘s object name into <code>.git/MERGE_HEAD</code> tells Git to generate a merge commit the next time <code>git commit</code> is called. There was just one last thing left to do: as <code>git read-tree</code>, according to the manpage, »does not actually update any of the files it ›caches‹«, a <code>git reset --hard</code> is needed to actually create the new files in the working copy. That’s it.</p>
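<p>For reference, the whole dance can be replayed end to end with two small throwaway repositories. Everything below (repository names, file names, commit messages) is invented for the demonstration; note the trailing slash on <code>--prefix</code>, which some Git versions insist on:</p>

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
# Helper so the demo works without a global Git identity configured:
ci() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

git init -q assimp     # stand-in for the git-svn clone of upstream
(cd assimp && echo code > code.txt && git add . && ci "upstream history")

git init -q dAssimp    # stand-in for the local bindings repository
(cd dAssimp && echo bindings > assimp.d && git add . && ci "D bindings")

cd assimp
git checkout -q -b d-bindings
git fetch -q ../dAssimp                          # HEAD of ../dAssimp -> FETCH_HEAD
git read-tree --prefix=port/dAssimp/ FETCH_HEAD  # graft the tree into the index
git rev-parse FETCH_HEAD > .git/MERGE_HEAD       # make the next commit a merge
ci "Merge the D bindings, history and all"
git reset -q --hard                              # materialise the files on disk
```

Afterwards the bindings live under <code>port/dAssimp</code>, and <code>git log</code> shows both lines of history joined by the merge commit.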
<p>As I found out later, I could probably have done this more easily using the <a href="http://www.kernel.org/pub/software/scm/git/docs/howto/using-merge-subtree.html">subtree merge strategy</a>, but I still like, as mentioned above, Git’s feature of »just doing what you tell it to do«…</p>Installing DMD, LDC, Tango and DSSS on (K)Ubuntu Jaunty2009-07-28T00:00:00+01:00http://klickverbot.at/blog/2009/07/installing-dmd-ldc-tango-and-dsss-on-kubuntu-jaunty<p>For quite a while now, I have been using the <a href="http://en.wikipedia.org/wiki/D_(programming_language)">D programming language</a>, version 1 (I have not looked at D2 yet, as it is said to be still rather unstable). Even though I like it very much for its syntactical quality and the language itself is reasonably mature, I must admit that setting up the toolchain correctly can still be a very cumbersome task, especially when you are new to D.</p>
<p>This post describes an installation routine that should provide you with a working D development environment containing <span class="caps">DMD</span>, <span class="caps">LDC</span>, Tango and <span class="caps">DSSS</span> on (K)Ubuntu Jaunty. Please note that it assumes your system to be »clean« – if you have already installed any D-related software, it is probably advisable to remove it completely to prevent any problems with, for instance, stale files. <!--more--></p>
<figure class='code'> <div class="highlight"><pre>mkdir -p ~/tmp<br />
<span class="nb">cd</span> ~/tmp<br />
<br />
wget http://ftp.digitalmars.com/dmd.1.050.zip<br />
unzip dmd.1.050.zip<br />
<span class="nb">cd </span>dmd/linux/bin/<br />
chmod +x dmd dumpobj obj2asm rdmd<br />
sudo cp dmd dmd.conf dumpobj obj2asm rdmd /usr/local/bin/<br />
<br />
<span class="nb">cd</span> ~/tmp<br />
svn co http://svn.dsource.org/projects/tango/trunk tango<br />
<br />
<span class="nb">cd </span>tango/<br />
sudo <span class="nv">DC</span><span class="o">=</span>dmd build/build.sh --lib-install-dir /usr/local/lib<br />
sudo cp -rf user/object.di user/rt user/std user/tango /usr/local/include/d/<br />
<br />
sudo su -c <span class="s1">'echo -e "[Environment]\nDFLAGS=-I/usr/local/include/d -defaultlib=tango-base-dmd -debuglib=tango-base-dmd -L-ltango-user-dmd -version=Tango -version=Posix" > /usr/local/bin/dmd.conf'</span><br />
<br />
sudo su -c <span class="s1">'echo -e "# dsss\ndeb http://ppa.launchpad.net/d-language-packagers/ppa/ubuntu jaunty main" >> /etc/apt/sources.list'</span><br />
sudo apt-get update<br />
sudo apt-get install dsss<br />
sudo su -c <span class="s1">'echo "profile=dmd-posix-tango" > /etc/drebuild/default'</span><br />
</pre></div></figure>
<p>You should now be able to build your D/Tango programs with <span class="caps">DMD</span> and <span class="caps">DSSS</span>.</p>
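<p>To give the fresh toolchain a quick smoke test, you can try a minimal D1/Tango program along these lines. The file name is arbitrary, and the build invocations are shown as comments since they need the compilers installed above:</p>

```shell
# A minimal D1/Tango program to smoke-test the installation:
cat > hello.d <<'EOF'
import tango.io.Stdout;

void main()
{
    Stdout("Hello, Tango!").newline;
}
EOF

# Build it with DMD (dmd.conf pulls in the Tango libraries)...
#   dmd hello.d && ./hello
# ...or let DSSS drive the build with the profile from /etc/drebuild/default:
#   dsss build hello.d
```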
<p>I would suggest giving <a href="http://www.dsource.org/projects/ldc"><span class="caps">LDC</span></a>, a fairly young compiler project which leverages <a href="http://llvm.org/"><span class="caps">LLVM</span></a> as its code-generating backend, at least a short glance. It is maturing very quickly and allows you to make use of the various features the <span class="caps">LLVM</span> compiler infrastructure provides, the most noticeable probably being its excellent optimization routines. Fortunately, there are current binary packages available at Launchpad, so all that is needed to install <span class="caps">LDC</span> is:</p>
<figure class='code'> <div class="highlight"><pre>sudo su -c <span class="s1">'echo -e "# ldc-daily\ndeb http://ppa.launchpad.net/d-language-packagers/ppa/ubuntu karmic main\ndeb http://archive.ubuntu.com/ubuntu karmic main universe" >> /etc/apt/sources.list'</span><br />
sudo apt-get update<br />
<br />
sudo apt-get install ldc-daily libtango-ldc-daily-dev<br />
<br />
<span class="nb">cd</span> ~/tmp<br />
wget -O ldc-posix-tango http://www.dsource.org/projects/ldc/browser/ldc-posix-tango?format<span class="o">=</span>raw<br />
sudo su -c <span class="s1">'sed "s:ldc.rebuild.conf:/etc/ldc/ldc.rebuild.conf:" ldc-posix-tango > /etc/drebuild/ldc-posix-tango'</span><br />
<br />
sudo su -c <span class="s1">'echo "profile=ldc-posix-tango" > /etc/drebuild/default'</span><br />
</pre></div></figure>
<p>Note that the above commands install a daily snapshot of <span class="caps">LDC</span>, which I would recommend using due to the currently fast pace of <span class="caps">LDC</span> development. In order not to break your Jaunty installation, please <em><strong>do not forget</strong> to comment out the official »karmic« repositories</em> (which contain some dependencies for <code>ldc-daily</code>) in your <code>/etc/apt/sources.list</code> and run <code>apt-get update</code> after the installation is completed.</p>
<p>Both compilers are set up to use Tango; do <em>not</em> install Tango via <span class="caps">DSSS</span>! If you want to switch compilers, just activate the corresponding profile in <code>/etc/drebuild/default</code>, and do not forget to rebuild any D libraries you might have compiled and installed with the old compiler (just run <code>dsss net install …</code> for the ones you installed using <span class="caps">DSSS</span>).</p>
<p class="update">Since I wrote this post, Tango received yet another big structural change to its codebase (amongst other changes, the core and user libraries have been merged). You should now use the supplied <em>»bob«</em> tool to build Tango. Additionally, Karmic is now stable, so you might have to adapt the <span class="caps">APT</span> repository-related instructions.</p>Getting KDE's clippboard to work with Eclipse2009-05-28T00:00:00+01:00http://klickverbot.at/blog/2009/05/getting-kdes-clippboard-to-work-with-eclipse<p>For whatever reason, Eclipse does not work well with Klipper, the <span class="caps">KDE</span> clipboard manager, in its default settings. The symptoms: Quite often, you copy a piece of text to the clipboard. When you try to paste it, it miraculously disappears and some old piece of clipboard content is pasted instead.</p>
<p>After simply trying to ignore the problem for some time, I searched and found a solution today: You have to disable the <em>»Prevent empty clipboard«</em> setting in Klipper’s configuration menu (which is accessible by right-clicking on the systray icon).</p>
<p>Intuitive? Not to me…</p>
<p class="update">Disabling the mentioned option might introduce some minor glitches to general clipboard usage (sometimes, the clipboard seems to empty itself). As those occur rather infrequently, I have not been able to find out why this happens, or even if this is connected to the configuration changes described in this post.</p>Duplicate »Translucency« KWin effect2009-04-28T00:00:00+01:00http://klickverbot.at/blog/2009/04/duplicate-translucency-kwin-effect<p>For some time, the effect »Translucency« was listed twice in the KWin <span class="caps">KCM</span> plugin list of my local <span class="caps">KDE</span> setup (<span class="caps">SVN</span> trunk). One copy was actually working, the other was just producing error messages.</p>
<p>Today, I finally had time to investigate the issue: The problem was caused by a stale <code>.desktop</code> file in <code>share/kde4/services/kwin</code> with the old name of the plugin (it was renamed from <em>maketransparent</em> to <em>translucency</em>).</p>
<p>I have no idea how this could happen, because I usually purge the whole <code>/opt/kde</code> folder every time I <code>svn up</code> my <code>qt-copy</code>, which I happen to do quite often…</p>Strange segfaults when compiling with GDC2009-02-04T00:00:00+00:00http://klickverbot.at/blog/2009/02/strange-segfaults-when-compiling-with-gdc<p>Use <code>-no-export-dynamic</code> to prevent segfaults in external libraries when linking with <span class="caps">GDC</span>.</p>
<p>More to come soon…</p>ZoneAlarm Firewall + Windows Vista = No Good2009-02-01T00:00:00+00:00http://klickverbot.at/blog/2009/02/zonealarm-firewall-windows-vista-no-good<p>I am terribly short on time at the moment, but I just have to record this:<br />
<em>Don’t use the Zone Labs ZoneAlarm Firewall on Microsoft Windows Vista. Just don’t do it!</em></p>
<p>For me, the firewall caused numerous seemingly unrelated problems: the Windows Update panel froze when started (had to kill explorer.exe), the Windows Defender updates were not working anymore, I could not install Visual Studio (Visual C++ 2008 Express to be precise), etc.</p>
<p>I will certainly write more about this sometime in the future…</p>
<p class="update">Apparently, Microsoft have already covered this issue in a <a href="http://support.microsoft.com/kb/321434">knowledge base entry</a> almost two years ago. D’oh! I will probably do some testing on this during the next days.</p>Debugging KDE applications with gdb2009-01-24T00:00:00+00:00http://klickverbot.at/blog/2009/01/debugging-kde-applications-with-gdb<p>A pretty long time has passed since my last post here, and in the meantime I have jumped right into <span class="caps">KDE</span> development myself. I have stumbled upon quite a few tricks and pitfalls that I will undoubtedly write about here some day.</p>
<p>For now, I just want to share a little gem which I have discovered a few minutes ago: In <code>kdesdk/scripts</code>, David Faure published a little script called <code>kde-devel-gdb</code> (<a href="http://websvn.kde.org/trunk/KDE/kdesdk/scripts/kde-devel-gdb?view=markup">view with WebSVN</a>), which extends gdb with the ability to print the contents of several Qt containers, including the widely used <code>QString</code>. Highly recommended!</p>KHotkeys in KDE 4.12008-11-02T00:00:00+00:00http://klickverbot.at/blog/2008/11/khotkeys-in-kde-4-1<p>I just upgraded to Kubuntu 8.10 in order to have a look at shiny new <span class="caps">KDE</span> 4. But amongst other minor annoyances, I had real trouble getting hotkeys to work. There are configuration options in the new System Settings panel (which is a huge regression compared to KControl by the way), but they seemed to have no effect.</p>
<p>After various attempts of fixing this problem myself, I finally found a (slightly hackish) solution: <a href="http://ubuntuforums.org/showpost.php?p=5541572&postcount=9">Re: Does khotkeys work on your <span class="caps">KDE</span> 4.1?</a>. Seems like <span class="caps">KDE</span> 4.1 is still <em>very</em> beta…</p>Git on Windows2008-10-12T00:00:00+01:00http://klickverbot.at/blog/2008/10/git-on-windows<p>Just found <a href="http://kylecordes.com/2008/04/30/git-windows-go/">an excellent writeup</a> by Kyle Cordes about using Git on Windows.</p>KDE Konsole corruption when Compiz is active2008-09-03T00:00:00+01:00http://klickverbot.at/blog/2008/09/kde-konsole-corruption-when-compiz-is-active<p>On my laptop I’m currently running Kubuntu 8.04 (Hardy Heron). For additional eye candy goodness I am using the <code>compiz-fusion</code> package from the Kubuntu repositories. Surprisingly, even on the laptop hardware (Asus V1S) everything went smooth out of the box – I could even manage to find some drivers for the webcam and for the finger print scanner.</p>
<p>Well, everything worked fine except for one little detail: When I had an active Konsole session on one cube face, for example a log file, and continued to work on another side of the cube, the Konsole output would often be broken when I switched back to it. It would look as if the output had been scrolled, but the non-scrolled output hadn’t been cleared from the window. This problem could be fixed by forcing the window to refresh, e.g. by switching to another (Konsole) tab.</p>
<p>In their 169.XX driver series, nVidia added a config option called <code>UseCompositeWrapper</code>, which can help to sort out this kind of redraw problems. Fortunately, enabling this via adding the following line to the <code>Device</code> section of my <code>xorg.conf</code> was enough to solve the problem:</p>
<pre><code>Option "UseCompositeWrapper" "true"
</code></pre>The Tangled Working Copy Problem2008-08-25T00:00:00+01:00http://klickverbot.at/blog/2008/08/the-tangled-working-copy-problem<p>There is a problem which probably every developer who uses revision control for their hobby projects has already experienced: the annoying situation of having two or more completely unrelated changes in your working tree. With its “index” feature, Git provides an excellent, if not quite intuitive, facility for solving the problem.</p>
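<p>When the unrelated changes at least live in separate files, the index lets you pick them apart with nothing more than selective <code>git add</code> calls; for changes tangled within a single file, <code>git add -p</code> does the same at the hunk level. A small demonstration with invented file names and messages:</p>

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
# Helper so the demo works without a global Git identity configured:
ci() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

git init -q tangled && cd tangled
echo feature > feature.txt
echo fix     > bugfix.txt
git add . && ci "initial state"

# Two unrelated edits pile up in the working copy...
echo "more feature work" >> feature.txt
echo "the actual fix"    >> bugfix.txt

# ...but only what is staged in the index gets committed:
git add bugfix.txt
ci "Fix the bug"
git add feature.txt
ci "Continue feature work"
```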
<p><a href="http://tomayko.com/writings/the-thing-about-git">This great article</a> by Ryan Tomayko is quite a comprehensive description of the problem itself (he calls it “The Tangled Working Copy Problem”) as well as a detailed guide on how to solve it using Git.</p>Getting started with git2008-08-02T00:00:00+01:00http://klickverbot.at/blog/2008/08/getting-started-with-git<p>Recently, I decided to have a look at the revision control system <a href="http://git-scm.com">Git</a> and the »social code hosting service« <a href="http://github.com">GitHub</a>, which is currently hyped in large parts of the open source community (Git is used for the Linux kernel, Ruby on Rails, …). At first, I felt pretty overwhelmed, because the differences to other SCMs like Subversion were bigger than I had expected. But now, having read the excellent blog post <a href="http://b.lesseverything.com/2008/3/25/got-git-howto-git-and-github">Got git?</a> by Steven Bristol, I am starting to understand the concepts and the motivation behind it.</p>
<p>Frankly, I am still not quite sure if the “doing it distributed” paradigm will really establish itself in everyday coding work, but for open source projects git looks very promising at least.</p>NoInternetOpenWith2008-06-26T00:00:00+01:00http://klickverbot.at/blog/2008/06/nointernetopenwith<p>At the moment, I am forced to use a box running Windows Vista for parts of my daily work. One thing that really annoys me about Windows is that nasty dialog asking you if you want to search the internet which pops up when you want to open a file with an extension Windows doesn’t know about. I am sure that 99 percent of the users would prefer to directly jump into the window where you can choose the application. But well, that’s just Microsoft, I guess.</p>
<p>Today, I finally found a tweak that removes that useless dialog:<br />
Add a dword called <code>NoInternetOpenWith</code> with the value <code>0x1</code> under <code>HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer</code>.</p>
<p>If you do not want to poke around manually in your registry, you can copy the following lines in a new <code>.reg</code> file and double-click it to add the key.</p>
<pre><code>Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer]
"NoInternetOpenWith"=dword:00000001
</code></pre>File encoding in Eclipse2008-06-20T00:00:00+01:00http://klickverbot.at/blog/2008/06/file-encoding-in-eclipse<p>While coding my first Rails application in Eclipse using RadRails, I found that my <span class="caps">HTML</span> files were not encoded in the correct format. Although I knew that you <em>can</em> specify the default file encoding (e.g. <span class="caps">UTF</span>-8) in Eclipse, it took me quite some time to find the corresponding option.</p>
<p>After looking for it for way too long (I guess it felt longer than it was, though), I finally found it in <code>Window > Preferences > General > Workspace</code>.</p>Distributing Rails Applications2008-06-20T00:00:00+01:00http://klickverbot.at/blog/2008/06/distributing-rails-applications<p>I just found <a href="http://www.erikveen.dds.nl/distributingrubyapplications/rails.html">this little tutorial</a> on how to package your Rails application into a neat archive for showing it to your customers.</p>
<p>I guess it will come in quite handy for me in a few weeks, because I am currently developing a small application for a friend of mine who is not a developer, but wants to run the application locally on his Windows box.</p>