Programming




cmake options

    cmake -D enable-qt=on -D enable-jpeg=on -D enable-double=on -DCMAKE_INSTALL_PREFIX:PATH=$PWD -DCMAKE_CXX_COMPILER=g++ -DCMAKE_C_COMPILER=gcc

CPU tick counter

Use the following as a tick counter; see Wikipedia for more details.

 #ifdef __cplusplus
 #include <cstdint>
 #else
 #include <stdint.h>
 #endif

 __inline__ uint64_t rdtsc(void) {
 	uint32_t lo, hi;
 	__asm__ __volatile__ (
 			"xorl %%eax, %%eax \n"
 			"cpuid"      /* serialize */
 			::: "%rax", "%rbx", "%rcx", "%rdx");
 	/* We cannot use "=A", since this would use %rax on x86_64 and return only the lower 32 bits of the TSC. */
 	__asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
 	return (uint64_t)hi << 32 | lo;
 }

 __inline__ uint64_t rdtscp(void) {
  	uint32_t lo, hi;
 	__asm__ __volatile__("rdtscp" : "=a"(lo), "=d"(hi) :: "ecx" );
 	return (uint64_t)hi << 32 | lo;
 }

Source code line counter

Use ohcount to count source lines of your project.

valgrind

  valgrind --tool=cachegrind ./prg args
This simulates cache accesses and misses; by default valgrind takes the cache sizes from the processor running the program.
  valgrind --tool=cachegrind --L2=8388608,8,64 ./prg args
Regardless of the real cache sizes, this run simulates an 8 MB L2 cache with 8-way set associativity and a 64-byte cache line size.
 cg_annotate cachegrind.out.<pid>
Use this to view the collected data. The Ir, Dr, and Dw columns show total cache accesses, not cache misses; the misses are shown in the following two columns.
 Ir  : I cache reads
 I1mr: I1 cache read misses
 I2mr: L2 cache instruction read misses
 Dr  : D cache reads
 D1mr: D1 cache read misses
 D2mr: L2 cache data read misses
 Dw  : D cache writes
 D1mw: D1 cache write misses
 D2mw: L2 cache data write misses
 Bc  : conditional branches executed
 Bcm : Conditional branches mispredicted
 Bi  : Indirect branches executed
 Bim : Indirect branches mispredicted

Profile-guided optimization (PGO) with gcc

  1. Compile each source file with option -fprofile-generate
gcc emits a file with the extension .gcno, containing branch information, for each input file.
  1. Run a representative set of workloads
The run writes data into files with the extension .gcda in the source folder. The directory containing the sources has to be available and writable. One output file is created for each input source file.
  1. Compile the final version using option -fprofile-use
  2. You can inspect the .gcno and .gcda files with the help of gcov. Using gcov you get files with the .gcov extension that contain branch counters, probabilities, etc.

opcontrol: a good tool for getting runtime properties

Note: If you use Ubuntu, see the next section for information about a problem I had with Ubuntu 11.04.

The following information is from Ulrich Drepper's great paper "What Every Programmer Should Know About Memory".

To count the CPU cycles on x86 and x86-64 processors, one has to issue the following command:

  opcontrol --event CPU_CLK_UNHALTED:30000:0:1:1
30000: the overrun number. Do not choose it too small, to avoid a system standstill; lower values are more accurate but cost you an almost unresponsive system during the run.
 opcontrol --list-events
lists the available events.

 opcontrol --start
starts profiling
 opcontrol --stop
stops profiling
 opcontrol --dump
dumps the information that the kernel collects to user level.
 oparchive
The data thus produced can be archived.
 opannotate
Use this to see where the various events happened. Counting CPU cycles will point out where the most time is spent (this includes cache misses).

Example: (run this as root)

 opcontrol -i cachebench
 opcontrol -e INST_RETIRED:6000:0:0:1 --start
Another option is CPU_CLK_UNHALTED.
 ./testprg
 opcontrol -h
 opreport
 opannotate --source

opcontrol error under Ubuntu 11.04: "Make sure you are using the non-compressed image file"

  sudo apt-get install fakeroot kernel-wedge build-essential makedumpfile kernel-package
  sudo apt-get build-dep --no-install-recommends linux-image-$(uname -r)
  mkdir ~/src
  cd ~/src
  apt-get source linux-image-$(uname -r)
  cd linux*
  make oldconfig vmlinux
  sudo opcontrol --vmlinux=vmlinux
  sudo opcontrol --start

Link against gcc libraries that you have compiled on your system

 Libraries have been installed in:
/home/user/gcc-4.2.4/build/lib/../lib64
 If you ever happen to want to link against installed libraries
 in a given directory, LIBDIR, you must either use libtool, and
 specify the full pathname of the library, or use the `-LLIBDIR'
 flag during linking and do at least one of the following:
   - add LIBDIR to the `LD_LIBRARY_PATH' environment variable
     during execution
   - add LIBDIR to the `LD_RUN_PATH' environment variable
     during linking
   - use the `-Wl,--rpath -Wl,LIBDIR' linker flag
   - have your system administrator add LIBDIR to `/etc/ld.so.conf'

Dependencies of binary code

 nm --demangle

unit tests

 cppunit

gcc gives output about the whole compilation process

 gcc -fdump-tree-all -S -O ./test.c
test.c.gimple contains the intermediate representation of the code.

GIMPLE is a very simple representation of code. See the gcc internals documentation for more information.


set stack size for user on linux

To show the stack size:

 $ ulimit -s

To change the stack size (the size is given in kilobytes):

 $ ulimit -s kbytes


2-3 trees

In a 2-3 tree each inner node has at least two and at most three children: one child on the left, one on the right, and possibly one in the middle. All leaves have the same depth, and each leaf represents a data item with a unique key. The keys are ordered from left to right in increasing order. Inner nodes are labeled with the keys of their right children.

depth: h (or height of the tree)

For a 2-3 tree with height h and n leaves, the following holds: log_3(n) <= h <= log_2(n)

INSERT: We insert an entry after an unsuccessful search. This can lead to a node with four children. We split these four children into two groups of two, which increases the number of children of the parent node by one. If the parent now also has four children, we continue splitting there as well. If we split the root, we create a new root node, which increases the height of the tree. Afterwards we have to adjust the labels from bottom to top: we start at the new leaf, walk up to the root, and update every label that is lower than the key of the new node. The time to insert an item is O(h).

DELETE: We delete a node after a successful search. If this leaves a group with only one leaf, we do the following:

  1. STEALING: If the group of leaves to the left or to the right has three leaves, we can steal one: take the node at the right end of the group to the left of our one-node group, or take the node at the left end of the group to its right. Out of a one-node group and a three-node group we create two two-node groups.
  2. MERGE: If there is no three-node neighbour, we merge the one-node with a two-node group in the neighbourhood. Afterwards the new parent node may itself become a one-node, and the merging/stealing process continues up to the root.

After deleting the node we walk from the root to the deleted position and replace every occurrence of the key of the deleted node with the key of the node directly to its left. The whole delete operation takes O(h).

The operations INSERT, DELETE, and SEARCH are O(log(n)) in a 2-3 tree.


sed: filter output of the console with sed

 | sed s/\&lt\;?php\\\\\(.*\\\\\)?\&gt\;/\<?php\ \\\\1\ ?\>/g 
 cat guestbook.xml | sed s/\\\(siehe\\\)/\\1\\1/g | grep "kommentar "

The first line is used in a PHP script that substitutes the PHP opening and closing parts.


XPath

Cross site scripting

Assume you have written a site on which you give users the possibility to search your data storage: a normal form where the user enters a query that you then search for. As a result you return the data that was found, but more important here is that the answer page echoes the query the user made. For example, the user enters, instead of meaningful text, the following:

 %22%3E%3Cscript%3Ealert(666);%3C/script%3E

and you do not validate this string, but return an answer page containing, for example, this PHP:

 Your query was <?php print $_GET["query"]; ?>

Then you will get an alert box saying 666, because the query contains an HTML script tag with an alert statement. So be careful with data supplied by the user.

gcc

 gcc -c main.c
compiles source to main.o
 gcc -o prg main.c file2.c
With option -o the compiler compiles and links the files together into an executable named prg.
 -l<library name>
 -L<path to library search path>
 -I<include path>
 -static
 -E
writes the preprocessed code to stdout
 -S
writes the assembler output (a .s file)
 -W
 -Wall (all warnings)
 -Wtraditional (warn about constructs that behave differently in traditional and ISO C)
 -Werror (warnings become errors)

 -ansi
 -std=<standard> (-std=c99 for the C standard from 1999)
 -pedantic

Preprocessor options:

 -E  Stop after the preprocessing stage; do not run the compiler proper.  The output is in the form of preprocessed source code, which is sent to the standard output.
 -M 
shows the dependency list of the source file.
 -C 
does not delete comments.
 GPROF (profiler)
 -pg
creates gmon.out which contains statistical data of the program run.
 -g
creates the symbol table for debugging
 -ggdb3
increases the amount of debug information.

Optimization:

 -On with -g for debugging
 -O0
no optimization
 -O1
 -O2

 -ffast-math
 -finline-functions
 -fno-inline
 -funroll-loops

make

parallel build

      make -j 16

replacements in variables

 SOURCES = foo.c bar.c
 TARGET: 
     @echo $(SOURCES:.c=.o)
     @echo $(SOURCES:%.c=%.o)
     @echo $(SOURCES:foo=replaced) # does not work

The filename is Makefile, comments begin with #, and the first character of a command line has to be a tab character.

 make target
calls commands for target "target"
 make
calls commands for the first target
 # target debug
 debug: main.c test1.o test2.o
     gcc -ggdb3 -o prg main.c test1.o test2.o

variables

 test: main.o test1.o test2.o 
     gcc -o $@ main.o test1.o test2.o -lm
"$@" stands for the target name, here "test". Use "$*" for the target name without the suffix, if there is one.
 $@     target name 
 $*     target name without suffix
 $<     first dependence of rule
 $^     all dependencies
 $?     all dependencies that are newer than the target
 $(@D)  directory name of $@
 $(@F)  only filename without path for $@
 CURDIR current working directory

replacements in variables

 $(NAME:%.c=%.o)

or

 $(NAME:.c=.o)

 objects = foo.o bar.o
 all: $(objects)

 $(objects) : %.o : %.c
      $(CC) -c $(CFLAGS) $< -o $@

 CC = gcc
 OBJECTS = main.o
 LIBS = -lm
 CFLAGS = -c
 test: $(OBJECTS)
     $(CC) -o $@ $(OBJECTS) $(LIBS)

implicit rules

 # delete existing rules
 .SUFFIXES:
 #define rule
 .SUFFIXES: .c .o
 #rule
 .c.o:
     command
creates a .o file from a .c file with command.

patterns

 %.o: %.c
     rule
uses the rule for any target with a .o suffix and the corresponding .c file as dependency

Creating libraries

static libraries

 /* myint.h */
 #ifndef _MYINT_H
 #define _MYINT_H
 typedef int MYINT;
 MYINT myint_equal(MYINT a, MYINT b);
 #endif 

 /* myint.c */
 #include "myint.h"

 MYINT myint_equal(MYINT a, MYINT b)
 {
   if(a==b) return 1; else return 0;
 }
  1. To make a static library from this tiny file you have to build an object file first:
 gcc -c myint.c
  1. Now we create an archive file .a with the command ar (archiver):
 ar crs libmyint.a myint.o
  1. create index entries
 ranlib libmyint.a
  1. install library

Copy libmyint.a to /usr/lib or /usr/local/lib and the header file myint.h to /usr/include or /usr/local/include.

  1. use the library
 gcc -o prg prg.o -L/mypath2myint/blablub -lmyint

or

 gcc -o prg prg.o libfolder/myint.a

dynamic libraries

The naming scheme is as follows:

 lib<name>.so.<major version>.<minor version>.<release>

 ldconfig
creates a symbolic link, named after the library name plus the major version number, that points to the actual library file.
 ldconfig -p
lists all installed symbolic links to libraries.
  1. create object file
 gcc -fPIC -Wall -g -c myint.c

 -fPIC
creates position independent code.
 gcc -ggdb3 -shared -Wl,-soname,libmyint.so.1 -o libmyint.so.1.0 myint.o -lc

 -shared
creates dynamic lib
 -Wl
passes the parameter through to the linker ld
 -soname
sets the soname, i.e. the library name with the major version number.

You should link the C library (-lc) into your own library.

  1. install lib
 cp libmyint.so.1.0 /usr/lib
 cd /usr/lib
 ln -fs libmyint.so.1.0 libmyint.so.1
 ln -fs libmyint.so.1 libmyint.so
 ldconfig

If ldconfig also creates the symbolic links, you can get rid of the third and fourth step.

  1. use of the lib
 gcc -o prg test.c -lmyint
dynamic load of libraries
 #include <dlfcn.h>
 void *dlopen(const char *filename, int flags);
 const char *dlerror(void);
 void *dlsym(void *handle, const char *symbol);
 int dlclose(void *handle);

 gcc -o prg prg.c -ldl
links against libdl.a or libdl.so, which provide these functions.

Run-Time measurement

TIME

The shell builtin time, or /usr/bin/time:

 $ /usr/bin/time gtk4
 real    0m3.052s
 user    0m0.100s
 sys     0m0.020s

real

Complete run-time from the beginning to the end.

user

Time that the process has spent at user level.

sys

Time that the process has spent at kernel level.

This is not very accurate because of other processes that run simultaneously.

GPROF

 $sudo apt-get install qprof

To measure specific functions you cannot use /usr/bin/time.

To measure these values you have to create a profile for GPROF. To do this with gcc, use option -pg:

 $gcc -pg -o prg prgsrc.c func.c 
 $./prg
This creates a file with name gmon.out.
 $gprof ./prg > test_profile.txt
Now we can call gprof with the program. It opens the file gmon.out and analyzes the program given as its argument.

Then you get the flat profile and the call graph of the program.

GCOV

"gcov" is installed with the gcc compiler package.

To count how often specific program lines were executed use gcov.

 $gcc -ggdb3 -fprofile-arcs -ftest-coverage -o prg prgsrc.c func.c
creates for every source file a .bbg file, which contains the control flow graph and maps it to the source code.
 $./prg
creates file with extension .da
 $gcov prgsrc.c
 $gcov func.c
gives a summary of how often the lines were executed. For further analysis look at the file prgsrc.c.gcov.

STRACE - Tracing system calls

 sudo apt-get install strace

To examine how a program interacts with the kernel, you can use strace.

 gcc -o prg prgsrc.c
 strace ./prg
 strace -o prg.log ./prg
writes output to prg.log
 strace -f -o 1.log ./prg
follows child processes.
 strace -p pid
follows the system calls of the process with the given pid.
 strace -e trace=open,write ./prg
prints only calls of open or write.

Special filter for a group of system calls:

 strace -e trace=file
 strace -e trace=process
 strace -e trace=network

Overflow on memory heap and memory leaks

efence

 gcc -o prg prgsrc.c -ggdb -lefence
compiles and links against libefence

Then use gdb to find the position of the program crash.

 run
 bt

valgrind

Valgrind does not need any special library. Use the tool with:

 valgrind ./prg

Besides "Invalid read of size" and "Invalid write of size" errors, you get a summary of allocations and deallocations, so you can easily see whether there is a memory leak.

With --leak-check=yes you analyze specifically for leaks:

 valgrind --leak-check=yes ./prg


PHP and XML

domxml does not work well by default, or is not even installed, so I tried to work with expat, which is installed by default. But there I had many problems with node contents and different character encodings like UTF-8 and ISO-8859-1.

 $parser_object = xml_parser_create();
 xml_set_element_handler($parser_object, "startElement", "endElement");
 if(!($fp=fopen("name.xml","r"))) die("cannot open xml file");
 while($data=fread($fp,4096)) xml_parse($parser_object, $data,feof($fp));

 function startElement($parser_object, $elementname, $attribute)
 {
     print "startelement";
 }

 function endElement($parser, $elementname)
 {
     print "endelement";
 }

colors on console

Small script to show effect:

 printf "\e[1;31mTEST\e[0;0m\n"
 printf "\e[1;32mTEST\e[0;0m\n"
 printf "\e[1;33mTEST\e[0;0m\n"
 printf "\e[1;34mTEST\e[0;0m\n"
 printf "\e[1;35mTEST\e[0;0m\n"
 printf "\e[1;36mTEST\e[0;0m\n"
 printf "\e[1;37mTEST\e[0;0m\n"
 printf "\e[1;38mTEST\e[0;0m\n"
 printf "\e[1;39mTEST\e[0;0m\n"
 printf "\e[1;40mTEST\e[0;0m\n"
 printf "\e[1;41mTEST\e[0;0m\n"
 printf "\e[1;42mTEST\e[0;0m\n"
 printf "\e[1;43mTEST\e[0;0m\n"



  Text color codes:
  30=black 31=red 32=green 33=yellow 34=blue 35=magenta 36=cyan 37=white
  Background color codes:
  40=black 41=red 42=green 43=yellow 44=blue 45=magenta 46=cyan 47=white

In C++ I tried it not with the code below, but with the following:

 cerr << "\e[1;31mATTENTION\e[0;0m" << endl;

That works well, without further include dependencies.

In python:

 from sys import stdout
 if stdout.isatty():
    print "^[[32;1mtty^[[0m"
 else:
    print "notty"

In C:

 #include<unistd.h>
 if(isatty(fileno(stdout))) printf("^[[32;1mtty^[[0m\n"); else printf("notty\n");

 00 normal
 01 bold
 04 underlined
 05 blinking
 07 foreground and background swapped
 22 restore normal
 30 black
 31 red
 .
 .
 47 white
 49 default background

iconv and recode

To convert a file from one coding scheme to another you can use iconv or recode.

 iconv -f UTF-8 -t latin1 -o iso8859-1.txt utf8.txt
 recode utf8..latin9 < utf8.txt > iso8859.txt

exec commands

 #include<unistd.h>
 int execl(...)
 int execlp()
 int execle()
 int execv()
 int execvp()
 int execve()

 e means that the environment is passed as a vector
 l means arguments are given as a list
 v means arguments are given as a vector
 p means that a filename is expected, not a path (PATH is searched)

win32 programming tools

 editbin myprg.exe /STACK:0x2000000
sets the stack size to 32 MB
 dumpbin /ALL /RAWDATA:NONE myprg.exe
shows information about the exe file.

Static Libraries

What they are, why they're useful.

An archive is a single file holding a collection of other files in a structure that makes it possible to retrieve the original individual files (called members of the archive). A static library is an archive whose members are object files. A library makes it possible for a program to use common routines without the administrative overhead of maintaining their source code, or the processing overhead of compiling them each time the program is compiled.

How they're named, version numbers, symbolic links.

Static libraries have names like libfoo.a, with no version numbers. There are normally no symbolic links associated with a static library.

Debugging symbols.

?

Where they're installed.

Static libraries are installed in /usr/lib or /usr/local/lib, not in /lib. They are used only during compilation. They are not critical to the operation of the system, so need not be on the root filesystem.

How ld finds them.

ld looks for the C library libc automatically. Otherwise, the option -lfoo will cause it to search its path-list for the library libfoo.a. ld looks for libraries in standard directories, plus those specified by -Ldir options on its command line.

What nm can tell you.

nm can list the symbols defined in a library. For example, "$ nm --print-file-name /usr/lib/*.a|grep cbrt" will show that the symbols containing the string "cbrt" are in libm.a (the math library).

Example commands to build a static library.

 ar rs libfoo.a foo1.o foo2.o

Example commands to compile and link using a local static library.

 gcc -I. -o jvct jvct.c libjvc.a

Example commands to install a static library.

 install -m 644 libjvc.a /usr/lib

Example commands to compile and link using an installed static library.

 gcc --static -I. -o jvct jvct.c -ljvc

Shared Libraries.

What they are, why they're useful.

Each program using routines from a static library has a copy of them in the executable file. This wastes disk space and (if more than one such program is in use at the same time) RAM as well. Also, when a static library is updated, all the programs using it must be recompiled in order to take advantage of the new code. When a program uses a shared library instead, the program binary does not include a copy of the code, but only a reference to the library. The run time loader, ld.so, finds the library and loads it into memory at the same time as the program.

How they're named, version numbers, symbolic links.

The run time loader finds the library by its "soname" which includes only the major version number (for example, "libfoo.so.1"). Therefore, a new version of the library can be installed, and existing programs will use it automatically. Of course, it is critical to change the major version number if calling sequences change in an incompatible way. Several libraries with different major version numbers can be installed at once, and in fact need to be, until all programs using the library have been recompiled. ldconfig creates a symbolic link with the soname pointing to the current version of the shared library. There should also be a symbolic link with no version number (for example, "libfoo.so") which is used at compile time to find the current version.

libfoo.sa: what "exported initialized library data" means.

What flags to compile with.

Debugging symbols.

Where they're installed.

The shared library ("libfoo.so.1.0.1") and the symbolic link with only the major version number ("libfoo.so.1") should be in /lib if the library is required by any program in /bin or /sbin. The link with no version number ("libfoo.so") is used only at compile time. It should be in /usr/lib.

How ld, ldconfig, ldd, and ld.so find them, and /etc/ld.so.conf.

At compile time, ld searches for shared libraries in standard directories and in directories added with -rpath dir and -Ldir command line options. During system boot, or when run manually (after updating libraries), ldconfig searches in standard places plus directories specified in /etc/ld.so.conf, and saves references to the libraries it finds in a cache. ld.so (at program run time) and ldd (as requested) look in directories specified by the -rpath option given at compile time, or in the cache built by ldconfig.

What "ldconfig -p", "ldconfig -D", nm and ldd can tell you.

"ldconfig -p" will display the contents of the cache. "ldconfig -D" will search for shared libraries and display them. "nm --dynamic /usr/lib/libfoo.so.X" will display the symbols defined by the library. "ldd foo" will display the shared libraries required by foo, and where they were found.

Example commands to build a shared library.

 gcc -shared -Wl,-soname,libjvc.so.1 -o libjvc.so.1.0.1 jvc.o
 ln -sf libjvc.so.1.0.1 libjvc.so.1.0
 ln -sf libjvc.so.1.0 libjvc.so.1
 ln -sf libjvc.so.1 libjvc.so

Example commands to compile and link using a local shared library.

 gcc -I. -o jvct jvct.c -ljvc -L. -Wl,-rpath,`pwd`

[To alter the search path for shared libraries without using -rpath, the ld man page suggests setting LD_RUN_PATH, and the ld.so man page says you set LD_AOUT_LIBRARY_PATH, but I could not get either one to work.]

Example commands to install a shared library.

 cp libjvc.so.1.0.1 /lib
 (cd /usr/lib; ln -sf ../../lib/libjvc.so.1.0.1 libjvc.so)
 ldconfig

Example commands to compile and link using an installed shared library.

 gcc -I. -o jvct jvct.c -ljvc