Your Style: Stylish presentations without Powerpoint or Keynote

11 11 2011

For the last few days I’ve been investigating ways to create presentations that are rendered as a mixture of HTML / CSS / JS and are written either in plain HTML or in some kind of markup language. The general idea behind this approach is that for 90% of the presentations you create, you follow a simple set of styles and do not need any special visualizations.

What interests me most is using my standard work environment, such as my editor of choice, and leaving the styling to a predefined set of markup rules, probably following some corporate identity guidelines.

Long story short, I’m disappointed with what I’ve seen. Basically there are two reasons why everything I looked at falls short of my requirements:

  • PDF generation – I want to generate a PDF of my slides that I can easily distribute (static HTML does not do the trick, and neither does publishing slides to Heroku or GitHub)
  • Templates – I have no idea why none of the tools I looked at provides simple support for templates. Is CI something that was lost in the whirlwinds of Web 2.0? And by templates I mean more than putting a logo at the bottom right.

And here are the tools I observed with their shortcomings:

  • Showoff – Markdown for the slides is perfect, templating is impossible, and PDF generation fails for images defined by CSS backgrounds (I might have a patch for that)
  • Deck.js – HTML is just too noisy to work on slides, and the print markup does not look like the original slide
  • Slippy – HTML is too noisy
  • LaTeX Beamer – LaTeX as markup is fine for me, PDF output is perfect, templating is a pain in the …

In general the markup based solutions are very easy to set up and use, but they typically fall short when it comes to fine-grained customization of the layout. The simplest example is that there is typically no way to place the slide title and the slide body separately.

As a result it looks like I’m back to Powerpoint.





C++11 / C++0x Documentation

1 11 2011

You want to improve your “old” C++ code base by allowing new features or bug fixes to be written with C++11? The questions you will face on the way are manifold. First, what exactly does C++11 allow me to do? Where is the documentation? What is the purpose of those features? Compared to languages like Ruby or Python, the C++ standard library and the language itself are not too well documented online.

In preparation for a lecture I’m giving next semester, I compiled a list of links to C++11 / C++0x documentation sites. As some kind of personal archive I will post them here:

General C++11 / C++0x FAQ

New Features (Overview)

Compiler Support

Standard Library Documentation

I will update this list as I find new sources. If you know of any links that are missing, please mention them in the comments and I will add them.
-Martin




Easy parsing of sysfs to get CPU topology data using libsystopo

20 01 2011

When you are working with main memory it is crucial to make sure that all data structures are correctly sized and aligned. A typical approach is to create blocks of data that are processed independently. For the developer, the question is how large such blocks should be. The answer is that those blocks should always be cache sized.

Now, how large is the cache on the system you are using? Either you run experiments to detect the different cache levels and the cache line sizes, or, if you are lucky enough to have an Intel Linux system at hand, you simply explore the information as it is stored in the sysfs filesystem exported by the Kernel.

However, while looking up this information by hand at development time might be ok, at run-time the best way is to adjust the system settings based on the actual configuration. At the moment, libudev does not support reading this information, so you’re on your own.
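Doing this by hand might look roughly like the following minimal sketch. It assumes the usual Linux sysfs layout, /sys/devices/system/cpu/cpuN/cache/indexM/size, where the size is stored as text such as “32K”. The function names parse_cache_size and read_cache_size are made up for this illustration and are not part of any library.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Convert a sysfs cache size string such as "32K" or "8M" into bytes.
// Returns 0 if the string does not start with a number.
inline std::size_t parse_cache_size(const std::string& text)
{
    std::istringstream in(text);
    std::size_t value = 0;
    if (!(in >> value)) return 0;

    char unit = '\0';
    in >> unit;  // stays '\0' if there is no unit suffix
    if (unit == 'K') return value * 1024;
    if (unit == 'M') return value * 1024 * 1024;
    return value;
}

// Read the size in bytes of one cache of one CPU (e.g. cpu = 0, index = 0).
// Returns 0 if the file does not exist, for example on a non-Linux system.
inline std::size_t read_cache_size(unsigned cpu, unsigned index)
{
    std::ostringstream path;
    path << "/sys/devices/system/cpu/cpu" << cpu
         << "/cache/index" << index << "/size";

    std::ifstream file(path.str().c_str());
    std::string text;
    if (!std::getline(file, text)) return 0;
    return parse_cache_size(text);
}
```

Every program that cares about cache sizes would have to repeat this kind of code for each attribute file in the cache directory.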

To avoid everybody writing the same code, I took some time to write a small library that reads the information about the CPU caches and the CPU topology and makes it easy to process this information in your program. You can find the most current version of the code, as usual, on GitHub.

#include <systopo.h>
using namespace systopo;

int main(void)
{
    System s = getSystemTopology();
    return 0;
}

The System structure holds all data parsed from sysfs. Its definition can be reviewed here:

    struct Cache
    {
        size_t coherency_line_size;
        size_t level;
        size_t number_of_sets;
        size_t physical_line_partition;
        size_t size;
        size_t ways_of_associativity;
    
        std::string type;
    
    };

    struct Topology
    {
        size_t core_id;
        size_t physical_package_id;
        std::vector<size_t> core_siblings;
        std::vector<size_t> thread_siblings;
    };
    
    struct CPU
    {
        std::vector<Cache> caches;
        Topology topology;
    };


    struct System
    {
        std::vector<CPU> cpus;
        std::vector<size_t> online_cpus;
        std::vector<size_t> offline_cpus;
    };

For more information about the meaning of the Topology fields, please refer to the Kernel documentation. The meaning of the CPU cache fields should be clear; if not, refer to “What every programmer should know about memory”.
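As a rough sketch of how these structures could be used to pick a block size: the code below repeats trimmed copies of Cache and CPU so that it compiles without systopo.h. The helpers data_cache_size and elements_per_block are hypothetical and not part of libsystopo. The sketch also assumes that type holds the sysfs strings “Data”, “Instruction” or “Unified” and that size is given in bytes, so please check both against the library before relying on them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Trimmed copies of the structs listed above (only the fields used here).
struct Cache
{
    std::size_t coherency_line_size;
    std::size_t level;
    std::size_t size;   // assumption: size in bytes
    std::string type;   // assumption: "Data", "Instruction" or "Unified"
};

struct CPU
{
    std::vector<Cache> caches;
};

// Hypothetical helper: size of the data (or unified) cache at the given
// level, or 0 if this CPU has no such cache.
inline std::size_t data_cache_size(const CPU& cpu, std::size_t level)
{
    for (std::size_t i = 0; i < cpu.caches.size(); ++i)
    {
        const Cache& c = cpu.caches[i];
        if (c.level == level && c.type != "Instruction")
            return c.size;
    }
    return 0;
}

// Hypothetical helper: number of elements that fit into one cache sized
// block. Always returns at least 1.
inline std::size_t elements_per_block(std::size_t cache_bytes,
                                      std::size_t element_size)
{
    if (element_size == 0 || cache_bytes < element_size) return 1;
    return cache_bytes / element_size;
}
```

With the real library, the CPU objects would come from the cpus vector of the System returned by getSystemTopology().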

If you have feedback, comments or ideas, I’m glad to respond!

– Martin





Emacs DBLP Mode for better Papers in LaTeX

25 10 2010

Do you remember the last time you were writing a paper and knew that you had reached the perfect point for a citation? You knew the author or the paper, but you did not want to look up the BibTeX entry or even create it. So you started to use rDBLP, but now you have to look up the citation key again and again. Since your BibTeX file is only built after the paper has been compiled for the first time, there is no chance to use common BibTeX management tools.

To solve this problem I wrote a small minor mode for Emacs that lets you search the DBLP database directly from Emacs and insert the correct citation key.

To install this minor mode follow these steps:

  1. Go to your local site lisp directory – e.g. ~/.emacs.d/elisp
  2. git clone git@github.com:grundprinzip/dblp.el.git
  3. and now add the following lines to your Emacs configuration to activate the minor mode as soon as you enter LaTeX mode
;; DBLP mode
(add-to-list 'load-path "~/.emacs.d/elisp/dblp.el")
(require 'dblp)
(add-hook 'LaTeX-mode-hook 'dblp-mode)

To start a query, press “C-M-c” while DBLP mode is active. This starts an interactive mode to query DBLP.

Currently the minor mode requires Ruby to be available on the platform. I plan to port the parser and querying to Lisp, but currently it’s easier for me to write it in Ruby. If you have any comments or questions, please leave me a message in the comments.

-Martin





rDBLP version 0.4.5 released

11 10 2010

rDBLP version 0.4.5 is more or less a maintenance release. It fixes several connection problems and, as an improvement, removes the cross-reference from the BibTeX entry. The book title in the main entry is now replaced by the title of the cross-reference. This should be more convenient given the limited space in a research paper.

To update your gem execute

sudo gem install dblp

For more detail see my original post.

-Martin





Emacs Movement Shortcuts Wallpaper

9 10 2010

Stumbling across the Vim Movements Shortcuts Wallpaper, I decided to make one for Emacs. Basically I really like the idea of having a real quick reference to the most important shortcuts on your desktop.

Emacs Wallpaper

If you have any additional shortcuts that you would love to have, please leave a comment. If you want to modify the wallpaper, I uploaded the original PDF version of the wallpaper that you can easily modify with any tool of your choice.

-Martin





Switch, case, typelists and type_switch

31 08 2010

Whenever you are building a system that has its own type system, you will come to a point where you perform type-dependent operations. If your types map seamlessly to standard integral types, most of the mapping code and extraction can be handled by simple template methods, but from time to time you will find code fragments like the following:

switch(type)
{
    case IntegerType: /**/
        do_something_important<int>(value);
        break;
    case DoubleType: /**/
        do_something_important<double>(value);
        break;
}

Interestingly, the only difference between the cases in the above code is the requested type. A concrete example is hashing a value: the type of the value is stored in a variable, and depending on the actual type a different hash function has to be called. When you write something like this for the first time you will feel ok, the second time a little more nervous, and the third time…

The question now is: how can I rewrite my code so that it is less explicit and, most importantly, easier to extend? The biggest problem with the above solution is that once you extend your type system, every such switch statement has to be changed. There must be an easier way to solve this.

One of my first approaches was macro magic to iterate over a sequence and then generate the right code by text expansion. However, this does not work because macros in C++ are not recursive: a macro is not expanded again inside its own expansion. While reading “Modern C++ Design” by Alexandrescu (a must read) I stumbled upon typelists (well, I had some support from @bastih01). After one evening of wrapping my head around them I finally made some progress.

The solution I found is based on static recursive template generation plus dynamic type switching at runtime. The reason for this rather complicated approach is the following: while the general type information is available at compile-time, the explicit instance related type information can only be mapped using an enum at run-time.

Enough words lost, what’s the solution?

Consider the following setup: First we define the type list and the enum for storing the type information.


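#include <string>
#include <boost/mpl/vector.hpp>

// Run-time tag for each supported type; the order must match the type list below.
typedef enum
{
    IntegerType,
    FloatType,
    StringType
} DataType;

typedef boost::mpl::vector<int, float, std::string> hyrise_basic_types;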

Now we need to implement our type_switch operator. It is based on TinyTL but used in a Boost environment, because I did not find anything similar in Boost directly.

#include <cstddef>
#include <stdexcept>
#include <boost/mpl/at.hpp>
#include <boost/mpl/size.hpp>

// Stop becomes true once N has run past the end of the type list L.
template <typename L, int N=0, bool Stop=(N==boost::mpl::size<L>::value)> struct type_switch;

template <typename L, int N, bool Stop>
struct type_switch
{
    template<class F>
    typename F::value_type operator()(size_t i, F& f)
    {
        if (i == static_cast<size_t>(N))
        {
            // F is a dependent type, so the call needs the "template" keyword.
            return f.template operator()<typename boost::mpl::at_c<L, N>::type>();
        } else {
            type_switch<L, N+1> next;
            return next(i,f);
        }
    }
};

// End of the list: no type matched the run-time index.
template <typename L, int N>
struct type_switch<L, N, true>
{
    template<class F>
    typename F::value_type operator()(size_t i, F& f)
    {
        throw std::runtime_error("Type does not exist");
    }
};

If you look at the above code for the first time, it is hard to understand what is going on, but once you understand template recursion it’s totally clear. Easy things first: boost::mpl::size defines a template that contains the size of a typelist, here hyrise_basic_types. The boost::mpl::at_c template defines a random accessor to the typelist based on a constant index.

The template type_switch is a special construct with 3 parameters, the first is the type list, the second is the current position in this list defaulting to 0, and the third is a boolean parameter determining if the recursion should stop. The default implementation of this struct with the operator() method checks if the current value i is equal to N and if this is the case calls the operator() method on the function object submitted as a parameter. If this is not the case it instantiates a new template and increases N by 1. This is possible because all int values for the complete list are known in advance at compile time and so they can be used as template parameters. To avoid infinite recursion a dedicated template specialization with Stop=true provides an implementation that should never be called and does not further invoke any template recursion.
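To try the recursion without a Boost installation, here is a minimal self-contained sketch of the same idea. Note that it swaps boost::mpl::vector for std::tuple from C++11, with std::tuple_size and std::tuple_element taking the roles of boost::mpl::size and boost::mpl::at_c. The size_of_functor is a toy functor invented for this example; it simply returns sizeof of the selected type.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <tuple>

// The type list; the position of a type is its run-time index.
typedef std::tuple<int, float, std::string> basic_types;

template <typename L, std::size_t N = 0,
          bool Stop = (N == std::tuple_size<L>::value)>
struct type_switch;

// General case: compare the run-time index with N, otherwise try N + 1.
template <typename L, std::size_t N, bool Stop>
struct type_switch
{
    template <class F>
    typename F::value_type operator()(std::size_t i, F& f)
    {
        if (i == N)
        {
            return f.template operator()<typename std::tuple_element<N, L>::type>();
        }
        type_switch<L, N + 1> next;
        return next(i, f);
    }
};

// End of the list: stops the template recursion and reports the error.
template <typename L, std::size_t N>
struct type_switch<L, N, true>
{
    template <class F>
    typename F::value_type operator()(std::size_t, F&)
    {
        throw std::runtime_error("Type does not exist");
    }
};

// Toy functor for illustration: returns sizeof of the selected type.
struct size_of_functor
{
    typedef std::size_t value_type;

    template <typename R>
    value_type operator()()
    {
        return sizeof(R);
    }
};
```

Calling type_switch<basic_types>()(1, f) with a size_of_functor f selects float, while an index of 3 or larger ends in the Stop specialization and throws.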

But back to the functor used in this setting. We have one requirement on the functor and this is that we have to specify the value_type of the operator() method directly in the functor. A sample implementation based on boost hash could look like the following.

#include <boost/functional/hash.hpp>

template<typename T>
struct hash_functor
{
    typedef T value_type;
    AbstractTable* table;
    size_t f;
    ValueId vid;

    hash_functor(): table(0) {}

    hash_functor(AbstractTable * t, size_t f, ValueId v): table(t), f(f), vid(v) {}

    // R is the concrete column type selected by type_switch.
    template<typename R>
    T operator()()
    {
        return boost::hash<R>()(table->getValueForValueId<R>(f, vid));
    }
};

For this functor, T defines the return type via the value_type typedef, and R is the actual type selected by the type_switch. Compared with the cluttered switch/case statement, the code for my type-dependent hash value method looks a lot better.

size_t hash_value(AbstractTable * source, size_t f, ValueId vid)
{
    hash_functor<size_t> fun(source, f, vid);
    type_switch<hyrise_basic_types> ts;
    return ts(source->typeOfColumn(f), fun);
}

A last word on the cost of this access: each level of the recursion generates at least one method call plus the evaluation of an if statement. The generic switch/case only generates the comparison, but I think this overhead is negligible compared to the huge amount of time saved when it comes to extending the usable data types.





rDBLP — Easy BibTeX Management for your Research Paper

31 08 2010

Maintain your LaTeX bibliography files using DBLP

It’s always the same: you write a paper or thesis, you search for sources, and then you modify them until they fit the format you are currently using. It should not be this way. A better way is to extract the right key from DBLP and automagically create the correct bib file. You think this is magic? No, it’s not; it’s really easy.

There are only two prerequisites for using rDBLP on your computer:

  1. Make sure Ruby and RubyGems are installed
  2. Make sure you have a working internet connection

To install the gem, execute this from your command line:

sudo gem install dblp

Once this is done you can use it directly for any LaTeX file. Imagine you have a LaTeX file containing the following citation somewhere.

The entity shaping used in web services as discussed in \cite{DBLP:conf/IEEEscc/GrundKZ08}...

\bibliographystyle{abbrv}
\bibliography{dblp}

When you now run the dblp command in your terminal, the program reads the auxiliary files from the compilation and extracts the requested DBLP citation keys. Then it downloads the entries and stores them in a file called “dblp.bib”, which you can use in your LaTeX document.

dblp my_file
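To illustrate the extraction step: LaTeX records every cited key in the .aux file as a \citation{...} line, with several keys separated by commas. The following is only a sketch of how such keys could be collected, written in C++ for illustration. It is not the gem’s actual Ruby implementation, and the function name dblp_keys is made up.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Collect all citation keys that start with "DBLP:" from the text of a
// LaTeX .aux file. Duplicates are dropped because the result is a set.
inline std::set<std::string> dblp_keys(const std::string& aux_text)
{
    std::set<std::string> keys;
    const std::string marker = "\\citation{";
    std::size_t pos = 0;

    while ((pos = aux_text.find(marker, pos)) != std::string::npos)
    {
        pos += marker.size();
        const std::size_t end = aux_text.find('}', pos);
        if (end == std::string::npos) break;

        // One \citation line may contain several comma separated keys.
        std::istringstream list(aux_text.substr(pos, end - pos));
        std::string key;
        while (std::getline(list, key, ','))
        {
            if (key.compare(0, 5, "DBLP:") == 0) keys.insert(key);
        }
        pos = end;
    }
    return keys;
}
```

Keys without the DBLP: prefix are ignored, so references you maintain by hand in another bib file are left alone.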

To make it really easy, here is a screencast I did for my small little tool:

If you have any questions, feel free to ask via mail or contact me using Twitter.





Relational DBMS vs Document Oriented Key-Value Stores?!?!

5 06 2010

These days document databases get a lot of buzz from all directions: programmers who search for freedom and VCs who search for the next big buck. But there is a big disadvantage: the louder the noise, the less the people using buzzwords like cloud or key-value store understand what’s behind them. So let me start with a short disclaimer: I am not against key-value stores or any other document oriented database!

Nonetheless I think it is necessary to clarify what a standard relational database can do, and especially how powerful a language like SQL is. So let’s clarify this with some examples and let me give you my take on the buzz. To make it more entertaining, consider this a small Q&A session with a key-value store guy and me as the defender of relational DBMS.

“When I’m working with documents I just dump everything into the object and can map-reduce everything! SQL can’t do this.”
Sure you can dump anything in an object, but SQL has a different heritage: SQL is a declarative language that is designed to express the What instead of the How. If you want to express algorithms directly on the database use stored procedures!

“But stored procedures are bad, because they are not portable!”
Ah, I get that, but how about your custom map-reduce code? Writing custom code for querying data is nothing more than using the data store API directly, which is exactly what a stored procedure does.

“In SQL I can’t create my own functions, I’m limited to MIN,MAX,AVG. That’s so bad!”
Whew, I hope that you did not stop reading about SQL in ’89 or ’92. The SQL standard provides everything you need to define your own functions. Most major DBMSs even let you write them in a general-purpose programming language.

“You know, I want to be flexible, stream content of different types. SQL can’t do this!”
Maybe you are missing a point here that is called normalization. Yes, SQL is restricted to flat tables but you can use normalization and queries to create any list of your content.

“But SQL is so slow, every time I create a join I’m lost. Look, XXX-DB is so fast using static indices once I created a map function!”
Dude, using an index to increase query performance is as old as I am. You want indices? Go create them! If you don’t know which queries you run, you wrote a bad application. For any other case you can turn to the index optimization techniques that have been well researched over the last decades.

“But it runs in the cloud! This makes it even faster and more scalable.”
Gnargh, I’m pretty sure you know about all the parallel database research that has been going on for decades. Being a key-value store does not by itself make a system scalable or faster. Please dig deeper.

To conclude, please understand that I’m not against key-value stores or document databases. They have their very specific terrain, as do standard relational databases. So what might be a good example for a document oriented key-value store application? Looking at the characteristics of such databases, one thing comes to mind immediately: since every object is stored under a unique key, applications benefit when the workload is almost exclusively single-object lookup and no aggregations take place. Since the hash functions used to build the indices for the keys are ideal for partitioning, scale-out is easier to implement. Another important point is the support for semi-structured data; here document oriented storage systems clearly win over relational systems.
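The partitioning argument can be made concrete with a small sketch. It is a toy in-memory store, not the design of any particular product: each document lives in the partition chosen by hashing its key, so a single-object lookup touches exactly one partition. The class name PartitionedStore and its methods are invented for this example.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy key-value store that spreads documents over a fixed number of
// partitions. In a real system each partition could live on its own node.
class PartitionedStore
{
public:
    explicit PartitionedStore(std::size_t partitions)
        : parts_(partitions == 0 ? 1 : partitions) {}

    // The hash of the key alone decides which partition owns the document.
    std::size_t index(const std::string& key) const
    {
        return std::hash<std::string>()(key) % parts_.size();
    }

    void put(const std::string& key, const std::string& document)
    {
        parts_[index(key)][key] = document;
    }

    // Returns a pointer to the stored document, or nullptr if the key is unknown.
    const std::string* get(const std::string& key) const
    {
        const std::unordered_map<std::string, std::string>& part = parts_[index(key)];
        std::unordered_map<std::string, std::string>::const_iterator it = part.find(key);
        return it == part.end() ? nullptr : &it->second;
    }

private:
    std::vector<std::unordered_map<std::string, std::string> > parts_;
};
```

An aggregation over all documents, by contrast, would have to visit every partition, which is exactly the workload where this layout stops being an advantage.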

The only thing I wish from people promoting document oriented storage is that they talk about it not only because it’s cool but because they can sketch a valid use case. If you feel like you don’t know enough about the possibilities of standard relational databases, talk to people who do or read something about it.

Did I offend you? Challenge me, let’s discuss!
– Martin





[slimtimercli] Version 0.1.8 published

5 02 2010

It’s been a while since I worked on slimtimercli, but today I received a pull request via GitHub telling me that evaryont had polished the gem, updated it and fixed some bugs. I took a few minutes to make slimtimercli use gemcutter and published it again as version 0.1.8.

I love open source software!

– Martin