Search

Sponsored Links

Meta

Categories

Archives

Recent Posts

RSS Feeds

03
Nov

Inline functions & Macros comparision

Inline functions :

 

 

  • processed at compile time
  • Its just a hint to comiler, compiler may ignore if it found code is not suitable for inlining
  • no side affects
  • easy to understand
  • type checking applied to function args
  • If u are trying to take address of a inline function, inlining will be ignored by compiler
  • If ur inlined code contains complex loops and all, inlining will be ignored
  • Ignoring inlining depends on comiler

 

Macros :

  • processed at preprocessor stage
  • produce unknown side affects
  • not that easy to understand
  • no type checking appiled for function args

 

Are macros useless

U can pass any type of argument to a macro (which can be achieved by using templates)

[tags]inline functions, macros, macros inline function comparison, macros inline functions which is better, inline functions explained[/tags]

Popularity: 5%

03
Nov

C Structure Padding

  • to access a structure member fastly, structure need to be padded
  • Padding will be done for each and every member, if required, not for the entire structure
  • Alignment : data storing boundary address starts at aligned address

Examples:

Size of int = 4 bytes

size of char = 1byte

alignment 64 bit (4 bytes)

Struct

{

int a ;

char b ;

} ;

size of struct = 8

Struct

{

int a ;

char b ;

int c ;

short int d ;

char e ;

} ;

size of struct = 16

+—————————-+
| a | b(3) | c | d e (1)|
+—————————-+
() padded bytes

Is there any way to supress this padding ?

Yes, using pragmas , typically pragma look like this

#pragma pack(1)

Why padding ?

For faster access to the members.

Popularity: 13%

02
Nov

Multithreaded Programming using POSIX pthreads – Part 2

We have discussed very basics of multithreading in Part1, now before we actually go to Posix threads programming, we dig bit further in how to program an application using multiple threads and which applications require threads because we should not waste our effort and time by multithreading a program that is not worth multithreading.
Designing threaded programs
In order for a program to take advantage of threads, it must be able to be organized into discrete, independent tasks which can execute concurrently. For example, if routine1 and routine2 can be interchanged, interleaved and/or overlapped in real time, they are candidates for threading.
Tasks that may be suitable for threading include tasks that:

  • Block for potentially long waits
  • Use many CPU cycles
  • Must respond to asynchronous events
  • Are of lesser or greater importance than other tasks
  • Are able to be performed in parallel with other tasks

Be careful if your application uses libraries or other objects that don’t explicitly guarantee thread-safeness. When in doubt, assume that they are not thread-safe until proven otherwise.
Thread-safeness: refers an application’s ability to execute multiple threads simultaneously without “clobbering” shared data or creating “race” conditions. For example, suppose that you use a library routine that accesses/modifies a global structure or location in memory. If two threads both call this routine it is possible that they may try to modify this global structure/memory location at the same time. If the routine does not employ some sort of synchronization constructs to prevent data corruption, then it is not thread-safe. The implication to users of external library routines is that if you aren’t 100% certain the routine is thread-safe, then you take your chances with problems that could arise.
The other terms you need to look at when using libraries/functions in threaded applications are
Reentrant code means that a program can have more than one thread executing concurrently? or a function get executed by itself after one or more function calls, example is a()->b()->x()->a(), in this a() is called again, if a() is not reentrant then we could be in troubles.
Async-safe means that a function is reentrant while handling a signal (e.g. can be called from a signal handler)
Concurrency vs. Parallelism - They are not the same! Parallelism is a subset of Concurrency. Parallelism implies simultaneous running of code (which is impossible on uniprocessor machines) while Concurrency implies that many tasks can run in any order and possibly in parallel.
Thread Design Patterns:

  • Manager/worker: a single thread, the manager assigns work to other threads, the workers. Typically, the manager handles all input and parcels out work to the other tasks. At least two forms of the manager/worker model are common: static worker pool and dynamic worker pool.
  • Pipeline: a task is broken into a series of suboperations, each of which is handled in series, but concurrently, by a different thread. An automobile assembly line best describes this model.
  • Peer: similar to the manager/worker model, but after the main thread creates other threads, it participates in the work.

Protecting Shared Resources

Because most useful applications are more complex in general, we must synchronize our threads and protect globally shared data across multiple threads.

Mutual Exclusion

Mutual exclusion is the method of serializing access to shared resources. You do not want a thread to be modifying a variable that is already in the process of being modified by another thread! Another scenario would be a dirty read where the value is in the process of being updated and another thread reads an old value.

A mutex is a lock that guarantees three things:

  1. Atomicity - Locking a mutex is an atomic operation, meaning that the operating system (or threads library) assures you that if you locked a mutex, no other thread succeeded in locking this mutex at the same time.
  2. Singularity - If a thread managed to lock a mutex, it is assured that no other thread will be able to lock the thread until the original thread releases the lock.
  3. Non-Busy Wait - If a thread attempts to lock a thread that was locked by a second thread, the first thread will be suspended (and will not consume any CPU resources) until the lock is freed by the second thread. At this time, the first thread will wake up and continue execution, having the mutex locked by it.

From these three points we can see how a mutex can be used to assure exclusive access to variables (or in general critical code sections). Here is some pseudo-code that updates the two variables we were talking about in the previous section, and can be used by the first thread:

lock mutex 'X1'.
set first variable to '0'.
set second variable to '0'.
unlock mutex 'X1'.

Meanwhile, the second thread will do something like this:

lock mutex 'X1'.
set first variable to '1'.
set second variable to '1'.
unlock mutex 'X1'.

Assuming both threads use the same mutex, we are assured that after they both ran through this code, either both variables are set to ‘0′, or both are set to ‘1′. You’d note this requires some work from the programmer - If a third thread was to access these variables via some code that does not use this mutex, it still might mess up the variable’s contents (this is just a framework to program in multi thread environment, right? If you use it properly it yields good results, otherwise you get screwed). Thus, it is important to enclose all the code that accesses these variables in a small set of functions, and always use only these functions to access these variables.
The code between the lock and unlock calls to the mutex, is referred to as the critical section. Minimizing time spent in the critical section allows for greater concurrency because it reduces the time other threads must wait to gain the lock. Therefore, it is important for a thread programmer to minimize critical sections.

Problems with Mutexes

An important problem associated with mutexes is the possibility of deadlock. A program can deadlock if two (or more) threads have stopped execution or are spinning permanently. The simplest deadlock situation: thread 1 locks lock A, thread 2 locks lock B, thread 1 wants lock B and thread 2 wants lock A. Instant deadlock. You can prevent this from happening by making sure threads acquire locks in an agreed order (lock ordering). Deadlock can also happen if threads do not unlock mutexes properly.

Race conditions occur when multiple threads share data and at least one of the threads accesses the data without going through a defined synchronization mechanism (Nichols 203). This could result in erroneous results even in an inconsistent manner which makes race conditions particularly difficult to debug. Library calls outside of your program’s control are common culprits. Make sure you take steps within your program to enforce serial access to shared file descriptors and other external resources. On most Solaris man pages, you can find out if your library call is safe to use in reentrant code. Towards the bottom of the man page, you will see Categories of MT Library Calls. MT Safe means that the function can be called concurrently from different threads. MT Hot are “fast” MT Safe functions (usually not found on man pages). MT Unsafe means that the function cannot be called concurrently. Alternative means that there are MT Safe equivalents (e.g. gethostbyname() and gethostbyname_r()).

Another problem with mutexes is that contention for a mutex can lead to priority inversion. A higher priority thread can wait behind a lower priority thread if the lower priority thread holds a lock for which the higher priority thread is waiting. This can be eliminated/reduced by limiting the number of shared mutexes between different priority threads.

Thread Synchronization Primitives

Mutexes are one method of synchronizing threads, however, there are many other ways.

Synch Technique I :: Condition Variables

A popular method of synchronizing multiple threads is through the use of condition variables. Condition variables allow threads to synchronize to a value of a shared resource. Condition variables provide a kind of notification system among threads.
For example, you could have a global counter, and once it reaches a certain count a thread activates. The thread that activates once the counter reaches the limit would wait on the condition variable. Other threads signal this condition variable if you want threads waiting/sleeping on this condition variable to wakeup. You can also use broadcast if you want to signal all threads waiting on the condition variable to wakeup. This sounds a bit confusing, but the pthread example below will clarify how condition variables work.

When waiting on condition variables, the wait should be inside a loop, not in a simple if statement because of spurious wakeups. You are not guaranteed that if a thread wakes up, it is the result of a signal or broadcastcall.

Synch Technique II :: Reader/Writer Locks

It is sometimes useful to allow multiple readers (threads) to read a shared resource variable without having to wait for locks if no thread is writing to the resource. With a multiple reader lock this is possible. Readers can enter the lock and do their business while writers wait until there are no readers left in the lock. Then the writer can modify the value while blocking all new readers from entering the lock.

Writer starvation is possible with the many reader/single writer lock technique. A priority system can eliminate writer starvation. For more information on r/w locks, see Nichols pgs. 84-89.

Synch Technique III :: Spinlocks

Spinlocks are less commonly used at the user-level. Spinlocks are used frequently in the Linux kernel itself. A spinlock will basically spin on a mutex. If a thread cannot obtain the mutex, it will keep polling the lock until it is free. The advantages to this is that if a thread is about to give up a mutex, you don’t have to context switch to another thread. This situation is a bit tricky because you don’t know when this might occur, and long spin times will result in poor performance.

Spinlocks should never be used on uniprocessor machines. Why is this?

Synch Technique IV :: Semaphores

Semaphores are another type of synchronization primitive that come in two flavors: binary and counting. Binary semaphores act much like mutexes, while counting semaphores can behave as recursive mutexes. Counting semaphores can be initialized to any arbitrary value which should depend on how many resources you have available for that particular shared data. Many threads can obtain the lock simultaneously until the limit is reached. This is referred to as lock depth.

Semaphores are more common in multiprocess programming (i.e. it’s usually used as a synch primitive between processes).
Now we are ready for programming and know about pthreads API.

Popularity: 9%

02
Nov

Multithreaded Programming using POSIX pthreads – Part 1


?

Introduction

Most of todays code is sequential/serialized, i.e. code is executed instruction by instruction, one after next. As humans tend to think sequentially, they put their thought process into programs, which essentially reflects their thought process. But the demand for multithreaded programming is increasing as to realize the advantage of Symmetric-Multiprocessing machines. You might have got a doubt, whether multithreaded program is useless on uniprocessing machines? The answer is even on uniprocessing machines they are helpful, say one thread blocked by a blocking call, other threads can use the CPU. However multiprocessing machine can yield full potential.
What is a Thread?
A thread is a sequence of instructions that can be executed in parallel with other threads. A thread is a light weight process, that means thread doesnt take as much space taken by a process, it light weight, of course its behavior is also little different from process. Before talking about threads in detail, lets talk about processes a bit.
A process is created by the operating system, and requires a fair amount of “overhead”. Processes contain information about program resources and program execution state, including:
Process ID, process group ID, user ID, and group ID
Environment
Working directory.
Program instructions
Registers
Stack
Heap
File descriptors
Signal actions
Shared libraries
Inter-process communication tools (such as message queues, pipes, semaphores, or shared memory).
Processes are scheduled by operating system. A process has these fundamental parts: code (”text”), data (VM), stack, file I/O, and signal tables. A typical process is shown below in fig1.

tfig1

So, we said processes have fair amount of overhead, so whats the overhead? Heavy-weight processes? have a significant amount of overhead when switching: all the tables have to be flushed from the processor for each task switch. Also, the only way to achieve shared information between processes is through pipes and “shared memory”. If a process spawns a child entire process space is duplicated. Cloning/Spawning processes take huge resources as processes are heavy weight.
So, now how threads can overcome this difficulty? Threads reduce overhead by sharing fundamental parts. By sharing these parts, switching happens much more frequently and efficiently.

?

Threads use and exist within the process resources, yet are able to be scheduled by the operating system and run as independent entities largely because they duplicate only the bare essential resources that enable them to exist as executable code.
This independent flow of control is accomplished because a thread maintains its own:

  • Stack pointer
  • Registers
  • Scheduling properties (such as policy or priority)
  • Set of pending and blocked signals
  • Thread specific data.

Threads in a process shown in fig2.
? tfig2

Types of Threads
There are two types of threads
Kernel-level threads

User-level threads
Kernel-Level Threads:
Kernel level threads consist of a set of registers, a stack, and a few corresponding kernel data structures. When kernel threads are used, the operating system will have a descriptor for each thread belonging to a process and it will schedule all the threads. Unlike processes, all threads within a process share the same address space. Similar to processes, when a kernel thread makes a blocking call, only that thread blocks.
The advantage of kernel threads over processes is faster creation and context switching compared with processes. For shared-memory multiprocessor architectures, the kernel is able to dispatch threads of one process on several processors, which leads to automatic load balancing within the nodes. For parallel programming, threads allow different parts of the parallel program to communicate by directly accessing each others’ memory, which allows very efficient, fine-grained communication.
Kernel threads share a single copy of the entire address space, including regions such as global data that may cause conflicts if used by multiple threads simultaneously. Threads can also cause unintentional data sharing, which leads to corruption and race conditions. To avoid this unintentional sharing, programs must often be modified to either lock or access separate copies of common data structures. Several very widely used language features are unsafe when used with threads, such as the use of global and static variables, or the idiom of returning a reference to a static buffer. Especially with large existing codebases with many global variables, this makes kernel threads very difficult to use because in most implementations of kernel threads, it is not possible to assign each thread a private set of global variables.
Kernel threads are considered “lightweight,” and one would expect the number of threads to only be limited by address space and processor time. Since every thread needs only a stack and a small data structure describing the thread, in principle this limit should not be a problem. But in practice, we found that many platforms impose hard limits on the maximum number of pthreads that can be created in a process.
In particular, operating system kernels tend to see kernel threads as a special kind of process rather than a unique entity. For example, in the Solaris kernel threads are called “light weight processes” (LWP’s). Linux actually creates kernel threads using a special variation of fork called “clone,” and until recently gave each thread a separate process ID. Because of this heritage, in practice kernel threads tend to be closer in memory and time cost to processes than user-level threads, although recent work has made some progress in closing the gap, including the Native POSIX Threading Library (NPTL) and Linux O(1) scheduler.


User-Level Threads:
Like a kernel thread, a user-level thread includes a set of registers and a stack, and shares the entire address space with the other threads in the enclosing process. Unlike a kernel thread, however, a user-level thread is handled entirely in user code, usually by a special library that provides at least start, swap and suspend calls. Because the OS is unaware of a user-level thread’s existence, a user-level thread can not separately receive signals or use operating system scheduling calls such as sleep(). Many implementations of user-level threads exist, including: GNU Portable Threads (Pth)? , FreeBSD’s userland threads, QuickThreads and those developed by us for the Charm++ system.
The primary advantages of user-level threads are efficiency and flexibility. Because the operating system is not involved, user-level threads can be made to use very little memory, and can be created and scheduled very quickly. User-level threads are also more flexible because the thread scheduler is in user code, which makes it much easier to schedule threads in an intelligent fashion — for example, the application’s priority structure can be directly used by the thread scheduler.
The primary disadvantage of user-level threads compared to kernel threads is the lack of operating system support. For example, when a user-level thread makes a blocking call, the kernel does not start running another user-level thread. Instead, the kernel suspends the entire calling kernel thread or process, even though another user-level thread might be ready to run. To avoid this blocking problem, some systems such as AIX and Solaris support “N:M” thread scheduling, which maps some number of application threads onto a (usually smaller) number of kernel entities. There are two parties, the kernel and the user parts of the thread system, involved in each thread operation for N:M threading, which is complex. The blocking problem can also be avoided by building a smarter runtime layer which intercepts blocking calls, replaces them with a non-blocking call, and starts another user-level thread while the call proceeds. Yet another approach is to provide support in the kernel to notify the user-level scheduler when a thread blocks, often called “scheduler activations”.
Since user-level threads are controlled in user code, there is virtually no limit on the maximum number of threads as long as the resource allows. In practice, one can create 50,000 user-level threads on a Linux box very easily.
Combination
Some implementations support both user- and kernel-level threads. This gives the advantages of each to the running task. However, since Linux’s kernel-level threads nearly perform as well as user-level, the only advantage of using user-threads would be the cooperative multitasking.

?

Popularity: 10%

01
Nov

Find Max and Min of two integers without branching

FindMinMax(int x,int y)
{
?  int min;
?  int max;
?  /* To find Minimum of x and y */
?  min = y + ((x - y) & -(x < y));
?  /* To find Maximum of x and y */
?  max = x - ((x - y) & -(x < y));
}

Popularity: 3%

01
Nov

Convert Unicode to Utf8

[highlight lang=”C”]

int UnicodeToUTF8(const WCHAR *Src, int SrcLen, unsigned char *Dest, int DestLen)
{
int i=0;
int outputlen=0; /*bytes */
char tchar;

if (!src || !dest)
{
return outputlen;
}

for ( i=0; i {
if ( outputlen >= DestLen )
{
//overflow detected
break;
}

if ( 0xdfff > Src[i] && 0xd800 < Src[i] )
{
tchar = (((Src[i] & 0×3c0) >> 6) + 1);
Dest[outputlen++] = (tchar >> 2 | 0xf0);
Dest[outputlen++] = ((tchar & 0×03 | 0×80) | (Src[i] & 0×3e) >> 2);
}
else if ( 0×800 > Src[i] )
{
Dest[outputlen++]= (Src[i] >> 6 | 0xc0);
Dest[outputlen++]= (Src[i] & 0×3f | 0×80);
}
else if ( 0×80 > Src[i] )
{
strDest[outputlen++] = Src[i];
}
else
{
Dest[outputlen++] = (Src[i] >> 12 | 0xe0);
Dest[outputlen++] = (Src[i] >> 6 & 0×3f | 0×80);
Dest[outputlen++] = (Src[i] & 0×3f | 0×80);
}
}

Dest[outputlen] = ?\0?;

return outputlen;
}

[/highlight]

[tags]utf8, unicode, unicode to utf8, unicode to utf8 conversion, unicode to utf8 conversion program[/tags]

Popularity: 9%

01
Nov

Swap Bits in a Byte without using any extra variable

SwapBitsInByte(unsigned char n)
{
  n = (n & 0xF0) >> 4 | (n & 0x0F) < < 4;
  n = (n & 0xCC) >> 2 | (n & 0×33) < < 2;
  n = (n & 0xAA) >> 1 | (n & 0×55) << 1;
  return n;
}

download this code

Popularity: 13%

« Previous Page