C/C++ trick: static string hash generation

I am always interested in having the compiler do more things for me, without giving away code clarity or performance for the convenience. Today a colleague linked me to Pope Kim's Compile-Time Hash String Generation article which is a perfect example of the things I like: hidden syntactic sugar that does useful things.

Inline hash function

The goal: for a given hash function, write something like HASH_STRING("funny_bone") in the code, and have the compiler directly replace it with the result, 0xf1c6fd7f.

The solution: inline the function and hope that the compiler will be clever enough.

#include <string.h>
#define HASH(str) generateHash(str, strlen(str))

inline unsigned int generateHash(const char *string, size_t len)
{
    unsigned int hash = 0;
    for(size_t i = 0; i < len; ++i)
        hash = 65599 * hash + string[i];
    return hash ^ (hash >> 16);
}

Unfortunately Pope ran into several very problematic issues:

  • requires heavy optimisation flags (/O2 with Visual Studio, -O3 with g++)
  • limited to 10-character strings with Visual Studio
  • limited to 17-character strings with g++

I could personally reproduce the g++ limitations. I believe they are more related to loop unrolling limits than to the actual string size, but they indeed make the technique unusable in practice.

Macro-based hash function

If you read my previous article about C/C++ preprocessor LUT generation, you may remember that it used preprocessor tricks to do loop unrolling.

Hence the following implementation:

#include <string.h>
#include <stdint.h>
#include <stdio.h>

#define H1(s,i,x)   (x*65599u+(uint8_t)s[(i)<strlen(s)?strlen(s)-1-(i):strlen(s)])
#define H4(s,i,x)   H1(s,i,H1(s,i+1,H1(s,i+2,H1(s,i+3,x))))
#define H16(s,i,x)  H4(s,i,H4(s,i+4,H4(s,i+8,H4(s,i+12,x))))
#define H64(s,i,x)  H16(s,i,H16(s,i+16,H16(s,i+32,H16(s,i+48,x))))
#define H256(s,i,x) H64(s,i,H64(s,i+64,H64(s,i+128,H64(s,i+192,x))))

#define HASH(s)    ((uint32_t)(H256(s,0,0)^(H256(s,0,0)>>16)))

It has the following properties:

  • works in C in addition to C++
  • strings are always optimised away by gcc or g++ (but not always the computation itself)
  • hash computation is optimised away by gcc or g++ even with -O, -O1 or -Os
  • string size limit is 256 characters (probably more than enough for most uses) and can be manually increased or decreased

The following code:

int main(void)
{
    printf("%08x\n", HASH("funny_bone"));
    printf("%08x\n", HASH("incredibly_large_string_that_gcc_groks_easily"));
}

Is (correctly) optimised to this with gcc -Os:

  ...
  movl    $-238617217, %esi
  movl    $.LC0, %edi
  xorl    %eax, %eax
  call    printf
  movl    $-453669173, %esi
  movl    $.LC0, %edi
  xorl    %eax, %eax
  call    printf
  ...

I haven't tested it with Visual Studio. Feedback from this compiler would be very appreciated!

  • Posted: 2012-01-12 18:05 (Updated: 2012-01-12 18:07)
  • Author: sam
  • Categories: code tip

Comments

1. Pope -- 2012-01-12 18:53

Sorry i stole your thunder. But I am jealous you got a very cool blog address.. :D Check this out too. (see the history to see humus' way to make it work for even longer chars.. I tested up to 64 chars: all works great both with VS2010 and g++)

http://twitter.com//BlindRenderer/status/157519159180263424

There are few usability issues especially a buffer like this: const char constBuffer[65] = "only3"; I'm trying to see if there's a walkaround :)

Also VS 2010 is not smart enough to know these two are same, thus not flattening down the first one with the Humus way(g++ is smart enough :D):

#1

const char * const constBuffer = "rawr";
unsigned int hashValue = HASH_STRING_CONST(constBuffer);

#2

unsigned int hashValue = HASH_STRING_CONST("rawr");
2. jj@notsharingmy.info -- 2012-01-12 19:20

I'm always enjoying your articles. You may have already seen this, but if not, this will be of interest to you. http://altdevblogaday.com/2011/10/27/quasi-compile-time-string-hashing/

4. pope -- 2012-01-12 19:50

sorry it did double posting.. i dunno why. it said the capcha thinks i'm spam so i did it.... and apparently the first one was there.. sorry delete once please :)

5. sam -- 2012-01-12 21:08

@Pope: that’s because I manually accepted your answer in the meantime; we must have been only a few seconds away :)

Thank you for the very interesting link. Using the fact that char foo[5] and char foo[6] are actually different arguments is very clever.

As for the const char constBuffer[65] = "only3"; case, I think it can be solved by slightly modifying the hash function so that it ignores characters after the first \0.

@jj: thanks! That’s a really superior implementation. If one doesn’t care about C support, that’s really the way to go.

6. yzk0370@gmail.com -- 2013-06-17 19:13

Hi, I am working on the similar things on vs2012. I have made the msvc compiler to put a hashed number directly as an asm operand just as your example. However, I still could find the plain text in the executable binary. I suppose the plain text should no longer be useful because it has already been hashed and replaced. Do you have any idea to omit the plain text from the binary totally?

7. Anonymous -- 2015-03-13 18:32

Nice write up. I'm having trouble using the HASH to initialize a static look up table. I looked over your LUT article, and this should be able to handle it.

<SNIP>
uint32_t hashes[] = {

HASH("string1"),
HASH("string2"),
HASH("string3"),

};

results in errors: initializer element is not constant ... for all.

I've tried with gcc 4.8 & 4.6 with all three -O options.

Any thing I'm doing wrong? Thoughts? Changes to gcc that might have broken this?

Thanks in advance.

8. Lukas -- 2015-11-05 10:00

I take your idea as an inspiration for creating a fully static HASH function (by taking out the string length functions to). Have a look at http://www.heeden.nl/statichashc.htm This even allow you to use the HASH value in a switch-case construction on a strict ANSI C compiler.

9. anonymous -- 2016-07-07 20:42

If the string are same, but one was constant and generated in compile time, another generated via a variable. they have different value. Why?

296. subanal -- 2016-11-27 02:08

yEDI5X I truly appreciate this post. I have been looking all over for this! Thank God I found it on Google. You have made my day! Thanks again..

337. mike tyson -- 2017-03-04 14:50

5fjnQy wow, awesome article post.Really looking forward to read more. Great.

338. anonymous -- 2017-11-11 06:28

Thanks a lot for the post. http://mysnaphack.com

339. mike -- 2018-01-15 14:00

You have posted very nice tutorial which i like reading. I must say you have made a great effort in making this post. I will come back to read more tutorials. Buy rubber matting from http://rubbermattingexperts.co.uk/rubber-matting.html

340. Eleston -- 2018-01-17 12:22

I am very happy to visit in the post. I see in the post some C/C++ trick: static string hash generation working for him. I am appreciate in the work. keep it up. Click here http://rubbertiles-uk.co.uk/

341. <a href="https://www.gulfdegreeonline.org/mba-in-dubai/">MBA In Dubai</a> -- 2018-01-19 12:21

Usually! C and C++ are easy to remember rather than to dot net and IOS because these syntaxes is not to remember learner have to do more practice than remembering it, if he/she does not connect the current trend so they will not stay in the market because the everyday trend has been changed in designing and development too!

342. anonymous -- 2018-01-22 04:46

The purpose of getting support in computer education as it works for ideal support for needy acts. The matter of support is for good and ideal needy support. This seen in site https://www.essayuniverse.net/ for good use.

343. davidmiky -- 2018-03-17 08:13

What a amazing post shared here. I really love this website, I would like to say thanks for sharing such great post. I would like to say bundle of thanks for sharing. test4actual 210-260

344. anonymous -- 2018-03-25 13:20

After reading this my takeaway is https://paperwritingservice.reviews/penmypaper-review/ you have to stand up to life and it’s many obstacles that are personal to you. We all have challenges and struggles but fighting back and seeing a different perspective is all we need to find the resolve to overcome.

345. anonymous -- 2018-04-28 08:17

I appreciate your efforts in preparing this post. I really like your blog articles. https://caseanswers.xyz/product-red-a-case-study-solution-%26-analysis/ Amazing article thanks or sharing.. (PRODUCT) RED (A) Case Study Solution & Analysis

346. anonymous -- 2018-04-28 08:35

Dissertation Guidance Provides quality Online Dissertation Help for students. This a good way to appreciate the teacher as they put their efforts to train students. https://casegurus.com/hubspot-inbound-marketing-and-web-20-166 UK dissertation Writers appreciates the teachers. HubSpot: Inbound Marketing and Web 2.0 Case Study Solution

347. anonymous -- 2018-04-28 11:31

Dissertation Guidance Provides quality Online Dissertation Help for students. https://www.casequiz.com/helicopter-investigation-24333/ Amazing article thanks or sharing.. Helicopter Investigation Case Study Solution & Analysis

348. anonymous -- 2018-04-28 12:16

I personally like your post; you have shared good insights and experiences. https://www.examonlinehelp.xyz/take-online-english-quiz-21338 Keep it up. Do My Online English Quiz

349. anonymous -- 2018-04-28 12:53

Really i appreciate the effort you made to share the knowledge. This is really a great stuff for sharing. https://www.finance-assignments.com/decision-making-3072 Keep it up . Thanks for sharing. Decision Making Assignment Help

350. anonymous -- 2018-04-28 13:27

This was a great and interesting article to read. I have really enjoyed all of this very cool information https://www.homeworkaustralia.com/dental-assignment-help-21522 Dental Homework Help Australia

351. anonymous -- 2018-04-28 13:58

Things are very open and intensely clear explanation of issues. was truly information. http://phphelponline.com/php-mysql-assignment-help-12779 Your website is very beneficial. Php MySQL Project Help

352. anonymous -- 2018-04-28 14:24

Such a nice post, keep up the fantastic work https://structuredsettlementscompanies.xyz/ This site and the resources you provide is really nice keep it up. structured settlement companies boca raton

353. Assignment Help -- 2018-05-11 10:49

When you must switch to the World Wide Web to search for some Assignment help. Our high expert's professionals who have extensive experience in writing assignment will not only make you avail with the quality assignment, but you will also get plagiarism free assignment within the given time. https://www.allassignmenthelp.com/

Add New Comment