Shared posts

01 Dec 17:23

Life of an instruction in LLVM

by Eli Bendersky

LLVM is a complex piece of software. There are several paths one may take on the quest of understanding how it works, none of which is simple. I recently had to dig into some areas of LLVM I was not previously familiar with, and this article is one of the outcomes of that quest.

What I aim to do here is follow the various incarnations an "instruction" takes as it goes through LLVM’s multiple compilation stages, starting as a syntactic construct in the source language and ending up encoded as binary machine code in an output object file.

This article in itself will not teach one how LLVM works. It assumes some existing familiarity with LLVM’s design and code base, and leaves a lot of "obvious" details out. Note that unless otherwise stated, the information here is relevant to LLVM 3.2. LLVM and Clang are fast-moving projects, and future changes may render parts of this article incorrect. If you notice any discrepancies, please let me know and I’ll do my best to fix them.


Input code

I want to start this exploration process at the beginning – C source. Here’s the simple function we’re going to work with:


int foo(int aa, int bb, int cc) {
int sum = aa + bb;
return sum / cc;
}

The focus of this article is going to be on the division operation.

Clang

Clang serves as the front-end for LLVM, responsible for converting C, C++ and ObjC source into LLVM IR. Clang’s main complexity comes from the ability to correctly parse and semantically analyze C++; the flow for a simple C-level operation is actually quite straightforward.

Clang’s parser builds an Abstract Syntax Tree (AST) out of the input. The AST is the main "currency" in which various parts of Clang deal. For our division operation, a BinaryOperator node is created in the AST, carrying the BO_Div "operator kind". Clang’s code generator then goes on to emit an sdiv LLVM IR instruction from the node, since this is a division of signed integral types.
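One way to see this node for yourself is to ask Clang to dump the AST and look for the BinaryOperator carrying the '/' operator (a usage sketch; the exact dump format differs between Clang versions):

$ clang -Xclang -ast-dump -fsyntax-only foo.c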

LLVM IR

Here is the LLVM IR created for the function:

define i32 @foo(i32 %aa, i32 %bb, i32 %cc) nounwind {
entry:
%add = add nsw i32 %aa, %bb
%div = sdiv i32 %add, %cc
ret i32 %div
}

In LLVM IR, sdiv is a BinaryOperator, which is a subclass of Instruction with the opcode SDiv. Like any other instruction, it can be processed by the LLVM analysis and transformation passes. For a specific example targeted at SDiv, take a look at SimplifySDivInst. Since all through the LLVM "middle-end" layer the instruction remains in its IR form, I won’t spend much time talking about it. To witness its next incarnation, we’ll have to look at the LLVM code generator.
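Before moving on, to make the "processed like any other instruction" point concrete, here is a minimal sketch (not code from LLVM itself) of how a pass might walk a function's instructions and pick out signed divisions; the header paths follow the LLVM 3.2 layout, before the llvm/IR/ reorganization:

// Sketch only -- not code from the LLVM sources. Walks a function's
// instructions and reports each signed division it finds.
#include "llvm/Function.h"
#include "llvm/Instructions.h"
#include "llvm/Support/InstIterator.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

static void findSignedDivisions(Function &F) {
  for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I) {
    if (BinaryOperator *BO = dyn_cast<BinaryOperator>(&*I)) {
      if (BO->getOpcode() == Instruction::SDiv)
        errs() << "found sdiv: " << *BO << "\n";
    }
  }
}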

The code generator is one of the most complex parts of LLVM. Its task is to "lower" the relatively high-level, target-independent LLVM IR into low-level, target-dependent "machine instructions" (MachineInstr). On its way to a MachineInstr, an LLVM IR instruction passes through a "selection DAG node" incarnation, which is what I’m going to discuss next.

SelectionDAG node

Selection DAG nodes are created by the SelectionDAGBuilder class acting "in the service of" SelectionDAGISel, which is the main base class for instruction selection. SelectionDAGISel goes over all the IR instructions and calls the SelectionDAGBuilder::visit dispatcher on them. The method handling a SDiv instruction is SelectionDAGBuilder::visitSDiv. It requests a new SDNode from the DAG with the opcode ISD::SDIV, which becomes a node in the DAG.
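The body of visitSDiv is essentially a one-liner. Paraphrased (this is not a verbatim copy of the LLVM 3.2 source, which also special-cases "exact" divisions), it does something along these lines:

// Paraphrase of SelectionDAGBuilder::visitSDiv -- not verbatim LLVM code.
void SelectionDAGBuilder::visitSDiv(const User &I) {
  SDValue Op1 = getValue(I.getOperand(0));
  SDValue Op2 = getValue(I.getOperand(1));
  // Add an ISD::SDIV node to the DAG and record it as the value
  // computed by this IR instruction.
  setValue(&I, DAG.getNode(ISD::SDIV, getCurDebugLoc(),
                           Op1.getValueType(), Op1, Op2));
}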

The initial DAG constructed this way is still only partially target dependent. In LLVM nomenclature it’s called "illegal" – the types it contains may not be directly supported by the target; the same is true for the operations it contains.

There are a couple of ways to visualize the DAG. One is to pass the -debug flag to llc, which will cause it to create a textual dump of the DAG during all the selection phases. Another is to pass one of the -view options which causes it to dump and display an actual image of the graph (more details in the code generator docs). Here’s the relevant portion of the DAG showing our SDiv node, right after DAG creation (the sdiv node is at the bottom):




Before the SelectionDAG machinery actually emits machine instructions from DAG nodes, these undergo a few other transformations. The most important are the type and operation legalization steps, which use target-specific hooks to convert all operations and types into ones that the target actually supports.

"Legalizing" sdiv into sdivrem on x86

The division instruction (idiv for signed operands) of x86 computes both the quotient and the remainder of the operation, and stores them in two separate registers. Since LLVM’s instruction selection distinguishes between such operations (called ISD::SDIVREM) and division that only computes the quotient (ISD::SDIV), our DAG node will be "legalized" during the DAG legalization phase when the target is x86. Here’s how it happens.

An important interface used by the code generator to convey target-specific information to the generally target-independent algorithms is TargetLowering. Targets implement this interface to describe how LLVM IR instructions should be lowered to legal SelectionDAG operations. The x86 implementation of this interface is X86TargetLowering. In its constructor it marks which operations need to be "expanded" by operation legalization, and ISD::SDIV is one of them. Here’s an interesting comment from the code:


// Scalar integer divide and remainder are lowered to use operations that
// produce two results, to match the available instructions. This exposes
// the two-result form to trivial CSE, which is able to combine x/y and x%y
// into a single instruction.
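The marking itself boils down to a handful of setOperationAction calls in the same constructor. Paraphrased (the real code loops over several integer types, and does much else besides):

// Paraphrased from the X86TargetLowering constructor (LLVM 3.2 era):
// ask operation legalization to Expand the single-result divide/remainder
// nodes, so DAG combining can merge x/y and x%y into one SDIVREM later.
setOperationAction(ISD::SDIV, MVT::i32, Expand);
setOperationAction(ISD::UDIV, MVT::i32, Expand);
setOperationAction(ISD::SREM, MVT::i32, Expand);
setOperationAction(ISD::UREM, MVT::i32, Expand);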
 
When SelectionDAGLegalize::LegalizeOp sees the Expand flag on an SDIV node, it replaces it with ISD::SDIVREM. This is an interesting example of the kind of transformation an operation can undergo while in selection DAG form.

Here is the relevant portion of the DAG after legalization:


Instruction selection – from SDNode to MachineSDNode

The next step in the code generation process is instruction selection. LLVM provides a generic table-based instruction selection mechanism that is auto-generated with the help of TableGen. Many target backends, however, choose to write custom code in their SelectionDAGISel::Select implementations to handle some instructions manually. Other instructions are then sent to the auto-generated selector by calling SelectCode.

The X86 backend handles ISD::SDIVREM manually in order to take care of some special cases and optimizations. The DAG node created at this step is a MachineSDNode, a subclass of SDNode which holds the information required to construct an actual machine instruction, but still in DAG node form. At this point the actual X86 instruction opcode is selected – X86::IDIV32r in our case.

Scheduling and emitting a MachineInstr

The code we have at this point is still represented as a DAG. But CPUs don’t execute DAGs, they execute a linear sequence of instructions. The goal of the scheduling step is to linearize the DAG by assigning an order to its operations (nodes). The simplest approach would be to just sort the DAG topologically, but LLVM’s code generator employs clever heuristics (such as register pressure reduction) to try and produce a schedule that would result in faster code.
Each target has some hooks it can implement to affect the way scheduling is done. I won't dwell on this topic here, however.

Finally, the scheduler emits a list of instructions into a MachineBasicBlock, using InstrEmitter::EmitMachineNode to translate from SDNode. The instructions here take the MachineInstr form ("MI form" from now on), and the DAG can be destroyed.
We can examine the machine instructions emitted in this step by calling llc with the -print-machineinstrs flag and looking at the first output that says "After instruction selection":


# After Instruction Selection:
# Machine code for function foo: SSA
Function Live Ins: %EDI in %vreg0, %ESI in %vreg1, %EDX in %vreg2
Function Live Outs: %EAX

BB#0: derived from LLVM BB %entry
Live Ins: %EDI %ESI %EDX
%vreg2<def> = COPY %EDX; GR32:%vreg2
%vreg1<def> = COPY %ESI; GR32:%vreg1
%vreg0<def> = COPY %EDI; GR32:%vreg0
%vreg3<def,tied1> = ADD32rr %vreg0<tied0>, %vreg1, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg0,%vreg1
%EAX<def> = COPY %vreg3; GR32:%vreg3
CDQ %EAX<imp-def>, %EDX<imp-def>, %EAX<imp-use>
IDIV32r %vreg2, %EAX<imp-def>, %EDX<imp-def,dead>, %EFLAGS<imp-def,dead>, %EAX<imp-use>, %EDX<imp-use>; GR32:%vreg2
%vreg4<def> = COPY %EAX; GR32:%vreg4
%EAX<def> = COPY %vreg4; GR32:%vreg4
RET

# End machine code for function foo.
 
Note that the output mentions that the code is in SSA form, and we can see that some registers being used are "virtual" registers (e.g. %vreg1).

Register allocation – from SSA to non-SSA machine instructions

Apart from some well-defined exceptions, the code generated by the instruction selector is in SSA form. In particular, it assumes it has an infinite set of "virtual" registers to act on. This, of course, isn’t true. Therefore, the next step of the code generator is to invoke a "register allocator", whose task is to replace virtual registers with physical registers from the target’s register bank.

The exceptions mentioned above are also important and interesting, so let’s talk about them a bit more.

Some instructions in some architectures require fixed registers. A good example is our division instruction in x86, which requires its inputs to be in the EDX and EAX registers. The instruction selector knows about these restrictions, so as we can see in the code above, the inputs to IDIV32r are physical, not virtual registers. This assignment is done by X86DAGToDAGISel::Select.

The register allocator takes care of all the non-fixed registers. There are a few more optimization (and pseudo-instruction expansion) steps that happen on machine instructions in SSA form, but I’m going to skip these. Similarly, I’m not going to discuss the steps performed after register allocation, since these don’t change the basic form operations appear in (MachineInstr, at this point). If you’re interested, take a look at TargetPassConfig::addMachinePasses.

Here are the MIs dumped after register allocation:


# After Virtual Register Rewriter:
# Machine code for function foo: Post SSA
Function Live Ins: %EDI in %vreg0, %ESI in %vreg1, %EDX in %vreg2
Function Live Outs: %EAX

0B BB#0: derived from LLVM BB %entry
Live Ins: %EDI %ESI %EDX
16B %ECX = COPY %EDX
64B %EAX = LEA64_32r %EDI, 1, %ESI, 0, %noreg
96B CDQ %EAX, %EDX, %EAX
112B IDIV32r %ECX, %EAX, %EDX, %EFLAGS, %EAX, %EDX
160B RET %EAX

# End machine code for function foo.

Emitting code

So we now have our original C function translated to MI form – a MachineFunction filled with instruction objects (MachineInstr). This is the point at which the code generator has finished its job and we can emit the code. In current LLVM, there are two ways to do that. One is the (legacy) JIT which emits executable, ready-to-run code directly into memory. The other is MC, which is an ambitious object-file-and-assembly framework that’s been part of LLVM for a couple of years, replacing the previous assembly generator. MC is currently being used for assembly and object file emission for all (or at least the important) LLVM targets. MC also enables "MCJIT", which is a JIT-ting framework based on the MC layer. This is why I’m referring to LLVM’s JIT module as legacy.

I will first say a few words about the legacy JIT and then turn to MC, which is more universally interesting.

The sequence of passes to JIT-emit code is defined by LLVMTargetMachine::addPassesToEmitMachineCode. It calls addPassesToGenerateCode, which defines all the passes required to do what most of this article has been talking about until now – turning IR into MI form. Next, it calls addCodeEmitter, which is a target-specific pass for converting MIs into actual machine code. Since MIs are already very low-level, it’s fairly straightforward to translate them to runnable machine code. The x86 code for that lives in lib/Target/X86/X86CodeEmitter.cpp. For our division instruction there’s no special handling here, because the MachineInstr it’s packaged in already contains its opcode and operands. It is handled generically with other instructions in emitInstruction.

MCInst

When LLVM is used as a static compiler (as part of clang, for instance), MIs are passed down to the MC layer which handles the object-file emission (it can also emit textual assembly files). Much can be said about MC, but that would require an article of its own. A good reference is this post from the LLVM blog. I will keep focusing on the path a single instruction takes.

LLVMTargetMachine::addPassesToEmitFile is responsible for defining the sequence of actions required to emit an object file. The actual MI-to-MCInst translation is done in the EmitInstruction method of the AsmPrinter interface. For x86, this method is implemented by X86AsmPrinter::EmitInstruction, which delegates the work to the X86MCInstLower class. Similarly to the JIT path, there is no special handling for our division instruction at this point; it’s treated generically along with other instructions.

By passing -show-mc-inst and -show-mc-encoding to llc, we can see the MC-level instructions it creates with their encoding, alongside the actual assembly code:
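Assuming the LLVM IR shown earlier is saved as foo.ll (the file name is just for illustration), the invocation looks like this:

$ llc foo.ll -o - -show-mc-encoding -show-mc-inst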


foo: # @foo
# BB#0: # %entry
movl %edx, %ecx # encoding: [0x89,0xd1]
# <MCInst #1483 MOV32rr
# <MCOperand Reg:46>
# <MCOperand Reg:48>>
leal (%rdi,%rsi), %eax # encoding: [0x8d,0x04,0x37]
# <MCInst #1096 LEA64_32r
# <MCOperand Reg:43>
# <MCOperand Reg:110>
# <MCOperand Imm:1>
# <MCOperand Reg:114>
# <MCOperand Imm:0>
# <MCOperand Reg:0>>
cltd # encoding: [0x99]
# <MCInst #352 CDQ>
idivl %ecx # encoding: [0xf7,0xf9]
# <MCInst #841 IDIV32r
# <MCOperand Reg:46>>
ret # encoding: [0xc3]
# <MCInst #2227 RET>
.Ltmp0:
.size foo, .Ltmp0-foo


The object file (or assembly code) emission is done by implementing the MCStreamer interface. Object files are emitted by MCObjectStreamer, which is further subclassed according to the actual object file format. For example, ELF emission is implemented in MCELFStreamer. The rough path an MCInst travels through the streamers is MCObjectStreamer::EmitInstruction followed by a format-specific EmitInstToData. The final emission of the instruction in binary form is, of course, target-specific. It’s handled by the MCCodeEmitter interface (for example X86MCCodeEmitter). While code in the rest of LLVM is often tricky because it has to maintain a separation between target-independent and target-specific capabilities, MC is even more challenging because it adds another dimension: different object file formats. So some code is completely generic, some code is format-dependent, and some code is target-dependent.

Assemblers and disassemblers

An MCInst is deliberately a very simple representation. It tries to shed as much semantic information as possible, keeping only the instruction opcode and list of operands (and a source location for assembler diagnostics). Like LLVM IR, it’s an internal representation with multiple possible encodings. The two most obvious are assembly (as shown above) and binary object files.

llvm-mc is a tool that uses the MC framework to implement assemblers and disassemblers. Internally, MCInst is the representation used to translate between the binary and textual forms. At this point the tool doesn’t care which compiler produced the assembly / object file.

Conclusion


Presenting a "big picture" view of a system as complex as LLVM isn't easy. I hope that this article succeeds in giving some clues about the internal workings of LLVM in a way that is useful for further exploration.
 
[This article is re-posted in a slightly expanded form from here]


01 Dec 17:12

The Comical Evolution of Actors and Characters in Film

by Pinar
Johnny Depp

Los Angeles-based illustrator Jeff Victor gives us all a timeline of the cast of characters big-time celebrities like Johnny Depp and Tom Hanks have played over the course of their careers in his Evolution of… series. Each illustration is like a fun little game that triggers one's memory of the great films of yesteryear that introduced these fresh-faced actors. It's especially entertaining for film fanatics, who can put their movie knowledge to the test by identifying movie character names with only the year of the film and an image of the actor in costume.

In addition to actors, the series includes a timeline of the different versions of Batman over the years as well as the varied looks of Back to the Future character Biff Tannen as the films travel through time. Pop culture enthusiasts rejoice!


Tom Hanks


Bill Murray


Jack Nicholson


Uma Thurman


Batman


Biff Tannen from Back to the Future


Sigourney Weaver


Kurt Russell


Rick Moranis


Natalie Portman

Jeff Victor website
via [reddit]

01 Dec 17:08

Awestruck Silhouettes Gazing at Spectacular Night Skies

by Pinar


Have you ever looked up at the night sky and been mesmerized by its beauty? That awe-inspiring moment when everything goes quiet and it's only you and the speckled night sky gazing back at each other produces an indescribable feeling. A sort of freeing effect where time stands still and you're immersed in the nighttime's allure. Somehow, Vancouver-based photographer Elizabeth Gadd manages to capture the essence of this transcendent experience in her series known as Breathe.

The collection of images features several silhouettes against spectacular backdrops. Each shot interprets the ethereal experience of getting lost in the world's beauty that seems almost too perfect to put into words. There's the recurring visual pattern of scattered dots of light—whether they appear in the form of distant stars, a flurry of snow, or the unfocused beads of light from an urban skyline—that is both calming and enchanting. Gadd describes the series simply as "Silence. Awe. Beauty. Peace of mind."





Elizabeth Gadd website
Elizabeth Gadd on Flickr
via [My Darkened Eyes]

01 Dec 16:44

8x8x8 LED Cube

by Alan Parekh

 

If you have been listening to the latest episodes of The Amp Hour you have probably heard of Club Jameco. Chris from Pyroelectro has checked it out: he purchased the 8x8x8 LED Cube kit. It is Arduino based and looks like a well-thought-out design. If you decide to give it a try, it looks like you will need to set aside a large amount of time, since the LED cube needs to be soldered carefully by hand and the circuit board is perf-board based. I think a custom PCB would make this kit quite a bit better.

Check out the kit build here.

 

30 Nov 22:20

Building a guitar: Part 1

by admin

After years admiring various Fender guitar models: Jaguar, Jazzmaster and Mustang, and always keeping in mind that an original, vintage, American, gorgeous example of these guitars costs more money than many a kidney out there, I made a decision: why not take a lutherie course and build my own guitar?

"Ah, are you crazy? That guitar is going to turn out terrible, probably a square, crude chunk of wood…" – A friend (the jerk)

It turns out I decided to try, and even if it took me on the order of 5 guitars before I managed to produce something decent and proto-playable, I went ahead and embraced the cause. What follows in this post and the next ones is an account of my adventures with wood and other electronic devices.

Lutherie course

To work with wood comfortably, access to a workshop is indispensable. I chose an easier path and decided to take a course so I could learn to use the tools correctly and pick up tricks from someone who already knows what he is doing. The idea of the course was to build a guitar from scratch, entirely by hand, going through every step until the instrument is complete. The luthier who taught the course was Ricardo Lucas (Ricky Lucas) from Campinas/SP. A really nice guy, he knows a lot about what he teaches and is a killer guitarist.

Choosing a model

It was quite hard to choose the model for the first guitar. There are infinitely beautiful guitars out there, and others that are infinitely complex. I wanted one that wasn't too complicated but was still good-looking. On top of that, choosing the color was another huge burden, but one that can be postponed to the end, since painting is one of the last steps. After several rounds of filtering, I was left with two candidates: building a clone of a Jaguar or of a Jazzmaster. Below is an example of each:

Fender Jazzmaster, J Mascis model (Dinosaur Jr.)

Fender Jaguar, with "matching headstock"

I decided to build a clone of a Fender Jaguar. The amount of metal paraphernalia on the body had always caught my eye, and it ended up winning me over.

Choosing the woods

This is a complex subject that could take days of speculation, because different woods result in distinct timbres and other tonal aspects that I don't understand well and won't dare lecture about. For the more curious, there is an undergraduate thesis from UNB on the evaluation of Brazilian woods for use in electric guitars. I'll be blunt: I chose Brazilian woods; they were the easiest to obtain, the cheapest, and of reasonable quality. Another factor is that, this being the first guitar I ever built myself, investing in expensive woods could end badly if I made some grotesque, unfixable mistake during construction. The woods used were:

  1. Body: Marupá
  2. Neck: Pau Marfim
  3. Fretboard: Jacarandá da Bahia (Brazilian Rosewood)

The first two are light ("white") woods and the last is dark.

Body template

To build the body, the first step is to make a template of the target guitar's body. It's also possible to use a ready-made jig: a wooden board containing the desired outline. I didn't have a Jaguar body at hand, much less a jig. My solution was to look for the drawing on the internet, and I ended up finding it in two different places: a Short Scale forum and a German site. The file with the Jaguar drawing was in dwg, the vector format used by AutoCAD (below, a screenshot of the viewer).

The idea here is simple: export that file to PDF at real scale and take it to a print shop to be printed on an A2 or A1 sheet. I ended up getting the scale slightly wrong, which resulted in a body a bit smaller than the original (as I will show in future posts).

Drawing and cutting the body

This paper with the outlines is used with carbon paper to transfer the template onto the body wood.

After tracing the outline over the carbon paper, remove the paper and clean up the contours by hand with a French curve ruler like the one below. This ruler has all the curves needed to produce a good guitar template; you just need the patience to find the right section for each stretch of the outline (especially around the body cutaways).

Done: we now have the body ready to go through the band saw:

In the next post I will explain about the two glued pieces and how to make the joint, and then we will move on to using the band saw.

28 Nov 19:37

Gaining Confidence in Confidence Intervals

by regehr

Many computer scientists, including myself, end up doing research that needs to be evaluated experimentally, and yet most of us have had little training in statistics or the design of experiments. We end up teaching ourselves this material as we go, which often works pretty well. On the other hand, sometimes the issues are surprisingly subtle. Today’s piece is about an example where I’ve found this to be the case; I ran across it a few years ago when some sanity checks on confidence interval code weren’t giving answers that seemed to make sense.

The topic is confidence intervals for a binomial proportion, where we take a series of independent yes/no measurements and use them to compute an interval that most likely contains the true proportion. For example, if we flip a coin 50 times and get heads 28 times, should we conclude that the coin is biased towards heads? In other words, does the confidence interval for the proportion of heads exclude 0.5? In this case, the 95% interval does contain 0.5 and so the data does not support the conclusion that the coin is unfair at the 95% level.
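For concreteness, using the familiar normal approximation discussed below: 28 heads out of 50 gives a sample proportion of 0.56, and the interval 0.56 ± 1.96*sqrt(0.56*0.44/50) ≈ 0.56 ± 0.14, or roughly [0.42, 0.70], still contains 0.5.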

If we reach some conclusion with 95% confidence, what does that actually mean? Informally, it means that if we repeated the entire experiment (in this case, 50 coin flips) many times, we would reach the correct conclusion 95% of the time. The other 5% of the time we would be incorrect. Statisticians formalize this as the coverage probability: “the proportion of the time that the interval contains the true value of interest.” At this point most of us would ask: Doesn’t the 95% confidence interval have a coverage probability of 95% by definition? Turns out the answer is “no” and this is where the fun starts.

Let’s take an example where n (the sample size) is 21 and p (the true proportion) is 0.479. We can empirically compute the coverage of the 95% confidence interval by repeatedly taking 21 random samples and seeing if the true proportion lies within the 95% confidence interval. For now we’ll use the normal approximation to the binomial. Here’s some code that does all this. Compile and run it like so:

$ gcc -O conf.c -o conf -lm
$ ./conf 21 0.479
n= 21, p= 0.479000, nominal coverage= 95.00 %
measured coverage= 91.62 % 
measured coverage= 91.63 %
measured coverage= 91.63 %
measured coverage= 91.61 % 
measured coverage= 91.60 %

As we can see, the coverage is about 91.6%, which means that although we asked for a 5% probability of error, we are actually getting an 8.4% chance of error—a pretty significant difference. There are a couple of things going on. One problem is that the normal approximation is not a very good one. In particular, it is not recommended when np or n(1-p) is less than 10. Here, however, we pass this test. Other sources give a less conservative test where np and n(1-p) are only required to be 5 or more, which can lead to coverage as small as about 87.4% (for n=11, p= 0.455).
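For reference, here is a minimal, self-contained sketch of the kind of coverage simulation used above. It is not the original conf.c (it is written in C++ and the trial count is illustrative); it repeatedly draws n Bernoulli(p) samples, builds the normal-approximation 95% interval around the sample proportion, and counts how often that interval contains p:

// Sketch of a coverage simulation for the normal-approximation interval.
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <random>

int main(int argc, char **argv) {
  if (argc != 3) {
    std::fprintf(stderr, "usage: %s n p\n", argv[0]);
    return 1;
  }
  const int n = std::atoi(argv[1]);
  const double p = std::atof(argv[2]);
  const double z = 1.959964;          // two-sided 95% normal quantile
  const long trials = 1000000;        // illustrative; more trials, less noise

  std::mt19937 rng(12345);
  std::bernoulli_distribution flip(p);

  long covered = 0;
  for (long t = 0; t < trials; ++t) {
    int successes = 0;
    for (int i = 0; i < n; ++i)
      successes += flip(rng);
    double phat = (double)successes / n;
    // Normal-approximation ("Wald") interval: phat +/- z*sqrt(phat*(1-phat)/n)
    double half = z * std::sqrt(phat * (1.0 - phat) / n);
    if (p >= phat - half && p <= phat + half)
      ++covered;
  }
  std::printf("measured coverage= %.2f %%\n", 100.0 * covered / trials);
  return 0;
}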

The unfortunate thing is that many stats classes, including the one I took many years ago, and also books such as the ubiquitous Art of Computer Systems Performance Analysis, do not go beyond presenting the normal approximation—I guess because it’s so simple. However, if you use this approximation and follow the np>5 rule, and aim for a 95% confidence interval, you could easily end up publishing results that are not even significant at the 90% level.

Many statisticians recommend avoiding the normal approximation altogether. But what should be used instead? The Clopper-Pearson interval is fully conservative: it is guaranteed to never have less coverage than you asked for. However, it is often overly conservative, so you might be getting 99% confidence when you only wanted 95%. What’s wrong with that? First, it may lead you to waste time running unnecessary experiments. Second, it may cause you to reject a hypothesis that actually holds at the desired level of confidence.

In my work for the last several years I’ve used the Wilson score interval. It is hardly any more difficult to implement than the normal approximation and has better behavior for small np. Also, it works for p=0 or p=1. The Agresti-Coull interval also has many good characteristics. Agresti and Coull’s paper from 1998 is short and easy to read for non-statisticians; I’d recommend it for people interested in learning more about these issues. A more exhaustive (and exhausting) discussion can be found in a 2008 paper by Pires and Amado.
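To back up the "hardly any more difficult" claim, here is a small sketch of the Wilson score interval; it is a direct transcription of the standard formula, not code taken from any of the references above:

// Wilson score interval for a binomial proportion: `successes` out of `n`
// trials at critical value z (1.96 for ~95%). Standard textbook formula.
#include <cmath>
#include <cstdio>

static void wilson_interval(int successes, int n, double z,
                            double *lo, double *hi) {
  double phat = (double)successes / n;
  double z2n  = z * z / n;
  double center = (phat + z2n / 2.0) / (1.0 + z2n);
  double half   = (z / (1.0 + z2n)) *
                  std::sqrt(phat * (1.0 - phat) / n + z * z / (4.0 * n * n));
  *lo = center - half;
  *hi = center + half;
}

int main() {
  double lo, hi;
  wilson_interval(28, 50, 1.96, &lo, &hi);   // the coin example from above
  std::printf("95%% Wilson interval: [%.3f, %.3f]\n", lo, hi);
  return 0;
}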

Earlier I alluded to the fact that the normal approximation was just one problem with creating a confidence interval for a binomial proportion. The other problem—and it is a problem for all approximations, including the “exact” Clopper-Pearson interval—is that since the binomial distribution is discrete, it is not possible in general to make the actual coverage probability be the same as the nominal coverage probability.

The relationship between the nominal and actual coverage probabilities is interesting to observe. The following graph (produced using the C program above) shows the behavior of the normal approximation of the 95% confidence interval for n=50, between the np=5 and n(1-p)=5 bounds:

 

I wrote this piece because the difference between nominal and actual coverage probabilities, which is well-understood by statisticians, does not seem to have been communicated effectively to practicing scientists. In some ideal world the details wouldn’t matter very much because people like me would simply rely on a convenient R library to do all confidence interval computations, and this library would be designed and vetted by experts. In practice, a one-size-fits-all solution is probably not optimal and anyway I always seem to end up with confidence interval code embedded in various scripts—so I’ve spent a fair amount of time trying to learn the underlying issues, in order to avoid screwing up.

Summary: Ignore what you probably learned about confidence intervals for proportions in college. Use the Agresti-Coull interval or Wilson score interval.

28 Nov 17:49

The Root Kit Movie

by Alan Parekh

 

If you want to help a great movie get made you might want to throw a few dollars towards The Root Kit movie by Jonathan Schiefer. I just recently heard about this on Frame Rate (go to 3:10 to see the interview about the movie) and the plot sounds great. They are going to make it on a shoestring budget as far as movies are concerned, but based on Jonathan’s background it sounds like he can make it happen. There are lots of interesting levels of sponsorship; for example, $4 will get your name in the credits, and $64 will get some of your code distributed with the movie.

 

26 Oct 11:00

Baumgartner Headshot, GIF’d

by René

I already tweeted this yesterday, but it has to go in here too: "I hate soccer but this is awesome". And it is.

23 Oct 18:31

Photo



17 Oct 14:02

Vectored I/O with mmap() to serve files

Previously, I improved file serving performance in lwan by dramatically cutting down on the number of system calls performed to serve a file. However, for small files (< 16KiB), the throughput drop relative to the hello handler (which merely responds "Hello, World!") was significant, since lwan was still performing system calls to open, obtain the size of, and close the file.

I had experimented with userland caching before, but it never occurred to me to use mmap(). For the uninitiated, this system call offers a way to map a file into memory: it gives back a pointer into the process's virtual memory space that, when dereferenced, performs the necessary disk I/O if the pages are not already present in the kernel buffers. Wikipedia has more details about it. Using mmap() greatly simplifies caching code by leaving much of the work to the kernel, closer to where the low-level buffers are.

By using a memory-mapped buffer and writev() (which the hello handler uses through lwan’s abstractions), the file serving performance improved by about 60%! Before the optimization, weighttp could make ~170000 requests/s; now it can make ~286000 requests/s. (That’s on my laptop, a Core i7 2640m, with 8GiB of RAM and without spinning platters.)
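A rough sketch of the idea, stripped of lwan's caching, abstractions and most error handling (the function and variable names here are made up for illustration): map the file once, then let writev() send the response header and the mapped contents in a single system call.

// Sketch only: map a file and send an HTTP-ish header plus its contents
// with one writev() call. Omits caching, HTTP parsing and robust errors.
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/uio.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

static int send_file(int client_fd, const char *path) {
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;

  struct stat st;
  if (fstat(fd, &st) < 0) { close(fd); return -1; }

  // Map the file; page faults pull the data from the kernel's page cache.
  void *contents = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
  close(fd);                       // the mapping stays valid after close()
  if (contents == MAP_FAILED)
    return -1;

  char header[256];
  int header_len = snprintf(header, sizeof(header),
                            "HTTP/1.1 200 OK\r\nContent-Length: %lld\r\n\r\n",
                            (long long)st.st_size);

  // Header and body go out in a single writev() call.
  struct iovec iov[2];
  iov[0].iov_base = header;
  iov[0].iov_len  = header_len;
  iov[1].iov_base = contents;
  iov[1].iov_len  = st.st_size;
  ssize_t written = writev(client_fd, iov, 2);

  munmap(contents, st.st_size);
  return written < 0 ? -1 : 0;
}

In a real cache the mapping would of course be kept around between requests rather than created and torn down per request, which is where the hash table and inotify watcher mentioned below come in.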

Of course, introducing caching also introduces a lot of complexity. Not only did the file serving handler almost double in size (from 350 lines to 610 lines), but I also had to add a hash table implementation (around 430 lines) and a directory watcher that uses inotify (around 150 lines of C code). That's on the order of 840 lines of code, roughly 30% more code, to improve performance by about 60% – not bad, methinks.

On the other hand, the cache mechanism brings shared mutable state. This is protected by mutexes, of course, but I’m not sure if I got it right. One more reason to not use lwan in production.

As a bonus to these things, lwan now offers deflated content for the files in the cache when asked.

17 Oct 14:00

Tetrahedra of Space

by 50 Watts
22 pulp illustrations by Frank R. Paul (1884–1963).

"Tetrahedra of Space," Wonder Stories cover, November 1931

I present here scans from expired Heritage Auction listings, following up on an October 2011 post of mid-20th-century science fiction and fantasy illustrations (Fantastic Plangent). I'll let two guys slightly more experienced in the field do the talking:

"Paul remains the undisputed king of the pulp artists" – Arthur C. Clarke
"Paul's fantastic covers for Amazing Stories changed my life forever." – Ray Bradbury

From Wikipedia: Frank Rudolph Paul (April 18, 1884 - June 29, 1963) was an illustrator of US pulp magazines in the science fiction field. He was born in Vienna, Austria and died at his home in Teaneck, New Jersey. A discovery of Hugo Gernsback (himself an immigrant from Luxembourg), Frank R. Paul was influential in defining what both cover art and interior illustrations in the nascent science fiction pulps of the 1920s looked like.

Two books of Paul's work (with the same material?): From the Pen of Paul: The Fantastic Images of Frank R. Paul and Frank R. Paul: Father of Science Fiction Art.

All artwork (c) Frank R. Paul estate ("If you are interested in obtaining rights to reprint Frank R. Paul's for commercial purposes, please contact Bill Engle, FRP's grandson at billengle@newmex.com").

Captions for the remaining scans:
original art for illustration, not dated
Science Fiction illustration, not dated
Wonder Stories cover, August 1930
"Apocalyptic New York," Wonder Stories cover
Double-page Science Fiction story illustration
"Lord of Tranerica," Dynamic Science Stories cover, February 1939
"The Robot Aliens," Wonder Stories illustration, February 1935 cover
"The Robot Aliens," Wonder Stories illustration, February 1935
Science fiction illustration
Amazing Stories, back cover, October 1945
"The 35th Millennium," Wonder Stories cover, August 1931
Science Fiction cover, June 1939
"Off to Space," Science Fiction Plus interior illustration, March 1953
"The Biological Revolt," Science Fiction Plus illustration, March 1953
"Stories of the Stars - Aldebaran," Fantastic Adventures back cover, December 1945
"West Point of Tomorrow," Thrilling Wonder Stories, September 1940 (I hope there's a contrarian out there still using the term "Scientifiction.")
"The Planet Juggler," story illustration
"As Mars Sees Us," 1940
Science Fiction Plus digest cover, December 1953
"Rockets and Men," interior story illustration
"The End of the Moon," Science Fiction Plus back-cover, August 1953
06 Oct 00:18

The CryptoParty Handbook

by corbet
The first draft of the CryptoParty Handbook, a 390-page guide to maintaining privacy in the networked world, is available. "This book was written in the first 3 days of October 2012 at Studio Weise7, Berlin, surrounded by fine food and a lake of coffee amidst a veritable snake pit of cables. Approximately 20 people were involved in its creation, some more than others, some local and some far (Melbourne in particular)." It is available under the (still evolving) CC-BY-SA 4.0 license. The guide, too, is still evolving; it should probably be regarded the way one would look at early-stage cryptographic code. Naturally, the authors are looking for contributors to help make the next release better.
04 Oct 19:00

Poll: What features would you like to see added soonest in your favorite C++ compiler?

by Herb Sutter

I just got back from teaching a class, and I’m always amazed at the breadth and diversity of C++ developers. As Bjarne Stroustrup famously says: “No one knows ‘what most C++ developers do.’”

In particular, I’m surprised at how strongly some people feel about certain features, such as refactoring or safety or raw performance or exhaustive conformance, that don’t matter much at all to other people. It made me wonder how a cross-section of the C++ developer community might prioritize them if I ran a poll.

So let me pose a question: Whether you use gcc or Clang or VC++ or Intel C++ or any other compiler, if you could only pick five of the following for your compiler to implement in its very next release, which would they be? Note that this is intended to be purely an “urgency” or ordering exercise, not an “importance” or eventually-will/won’t-do exercise — assume that your compiler vendor would eventually add everything possible from the list in their next few releases, and ask yourself only what would you just love to have in your hands the very soonest, right away?

(This is not intended to be a complete list, by the way, just picking some things I’ve again recently heard people mention, to validate how broadly interesting they really are.)

Thanks in advance for your participation. I’m really curious to see the results.

[Updated to add: Several people have asked to change the poll to add an option for "faster build times". I won't change the poll while in progress which would skew results, but for now if you want to vote for "faster builds" please post it as a comment and I'll manually add those comments up since that seems to be a popular request.]

Take Our Poll
Filed under: C++