* Pointer to a char
@ 2012-09-18 9:29 Randi Botse
2012-09-18 10:29 ` Phil Sutter
2012-09-19 1:04 ` Jon Mayo
0 siblings, 2 replies; 7+ messages in thread
From: Randi Botse @ 2012-09-18 9:29 UTC (permalink / raw)
To: linux-c-programming
Hi, I have been coding in C for 3 years, but I'm still not clear on this one.
Consider this code.
...
char *p;
unsigned int i = 0xcccccccc;
unsigned int j;
p = (char *) &i;
printf("%.2x %.2x %.2x %.2x\n", *p, p[1], p[2], p[3]);
memcpy(&j, p, sizeof(unsigned int));
printf("%x\n", j);
...
Output:
ffffffcc ffffffcc ffffffcc ffffffcc
0xcccccccc
My questions are:
1. Why it prints "ffffffcc ffffffcc ffffffcc ffffffcc"? (if p is
unsigned char* then it will print correctly "cc cc cc cc")
2. Why pointer to char p copied to j correctly, why not every member
in p overflow? since it is a signed char.
Regards.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Pointer to a char
2012-09-18 9:29 Pointer to a char Randi Botse
@ 2012-09-18 10:29 ` Phil Sutter
2012-09-18 10:33 ` Duan Fugang-B38611
2012-09-19 1:04 ` Jon Mayo
1 sibling, 1 reply; 7+ messages in thread
From: Phil Sutter @ 2012-09-18 10:29 UTC (permalink / raw)
To: Randi Botse; +Cc: linux-c-programming

Hi,

On Tue, Sep 18, 2012 at 04:29:32PM +0700, Randi Botse wrote:
> ...
> char *p;
> unsigned int i = 0xcccccccc;
> unsigned int j;
>
> p = (char *) &i;
> printf("%.2x %.2x %.2x %.2x\n", *p, p[1], p[2], p[3]);
>
> memcpy(&j, p, sizeof(unsigned int));
> printf("%x\n", j);
> ...
>
> Output:
>
> ffffffcc ffffffcc ffffffcc ffffffcc
> 0xcccccccc
>
>
> My questions are:
>
> 1. Why it prints "ffffffcc ffffffcc ffffffcc ffffffcc"? (if p is
> unsigned char* then it will print correctly "cc cc cc cc")

This is because of the two's complement representation in which signed
values are stored internally. Since %x is a conversion of an integer,
the passed char is sign-extended to int, which in two's complement means
that the leading bit is replicated to fill the upper bits. (0xC is 1100
in binary, so the leading bit of 0xcc is 1.)

> 2. Why pointer to char p copied to j correctly, why not every member
> in p overflow? since it is a signed char.

I am not quite sure what the question is here (maybe caused by the lack
of verbs in your sentence). Keep in mind that memcpy() only copies the
memory, irrespective of the pointer type passed. Also,
sizeof(unsigned int) == sizeof(int).

HTH, Phil

^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: Pointer to a char
2012-09-18 10:29 ` Phil Sutter
@ 2012-09-18 10:33 ` Duan Fugang-B38611
0 siblings, 0 replies; 7+ messages in thread
From: Duan Fugang-B38611 @ 2012-09-18 10:33 UTC (permalink / raw)
To: Phil Sutter, Randi Botse; +Cc: linux-c-programming

Thanks, Phil,

That is a great, detailed explanation.

Best Regards,
Andy

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Pointer to a char
2012-09-18 9:29 Pointer to a char Randi Botse
2012-09-18 10:29 ` Phil Sutter
@ 2012-09-19 1:04 ` Jon Mayo
2012-09-19 7:59 ` Randi Botse
1 sibling, 1 reply; 7+ messages in thread
From: Jon Mayo @ 2012-09-19 1:04 UTC (permalink / raw)
To: Randi Botse; +Cc: linux-c-programming

On Tue, Sep 18, 2012 at 2:29 AM, Randi Botse <nightdecoder@gmail.com> wrote:
> Hi, I have been coding in C for 3 years, but I'm still not clear on this one.
> Consider this code.
>
> ...
> char *p;
> unsigned int i = 0xcccccccc;
> unsigned int j;
>
> p = (char *) &i;
> printf("%.2x %.2x %.2x %.2x\n", *p, p[1], p[2], p[3]);
>

printf (and other variadic functions) don't take char, short or float.
Those arguments are promoted first: char and short become int, and float
becomes double. Your [signed] chars get sign-extended when they are
converted to signed int (0xcc = -52).

> memcpy(&j, p, sizeof(unsigned int));

The data at i, pointed to by p, has not changed, so this memcpy works.
The only thing that is weird is how you interpreted the data (in your
printf above).

> printf("%x\n", j);
> ...
>
> Output:
>
> ffffffcc ffffffcc ffffffcc ffffffcc
> 0xcccccccc
>
>
> My questions are:
>
> 1. Why it prints "ffffffcc ffffffcc ffffffcc ffffffcc"? (if p is
> unsigned char* then it will print correctly "cc cc cc cc")
> 2. Why pointer to char p copied to j correctly, why not every member
> in p overflow? since it is a signed char.
>
> Regards.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-c-programming" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Pointer to a char
2012-09-19 1:04 ` Jon Mayo
@ 2012-09-19 7:59 ` Randi Botse
2012-09-19 8:47 ` Leon Shaw
2012-09-19 18:09 ` Jon Mayo
0 siblings, 2 replies; 7+ messages in thread
From: Randi Botse @ 2012-09-19 7:59 UTC (permalink / raw)
To: linux-c-programming

Hi Phil, Jon

Thanks, now I'm clear on this: assignment doesn't care about the type
modifier.

Code such as

unsigned int j = 0xffeeddcc;
int i = j;

Both have the same value, depending on how they are interpreted (is this
assumption correct?)

Because,

printf("%u", i) will be different from printf("%i", i)
- but -
printf("%u", i) will be the same as printf("%u", j)


Actually, I'm asking this because I often see a pointer cast to char *.

Let me show you with this example.
Consider some structures...

struct a_data {
    unsigned char f1[4];
    unsigned char f2[6];
    unsigned short f3[2];
};

and other structs named b_data, c_data, etc.

Then there is a general function to process all types of structure,
maybe something like this:

int process_data(char *buffer, size_t len);

Then if we cast, for example, a pointer to an a_data struct to a char *
as follows:

struct a_data a;
process_data((char*) &a, sizeof(a));

I thought that since it was cast to char *, the cast is a "problem",
because every signed char in the buffer has a range of CHAR_MIN to
CHAR_MAX, so values from CHAR_MAX to UCHAR_MAX will be broken (signed
char overflow).

I think process_data() should be declared as

int process_data(unsigned char *buffer, size_t len)

This declaration seems correct and works for me.

However, now I conceptually understand why this works.

Thanks.

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Pointer to a char
2012-09-19 7:59 ` Randi Botse
@ 2012-09-19 8:47 ` Leon Shaw
2012-09-19 18:09 ` Jon Mayo
1 sibling, 0 replies; 7+ messages in thread
From: Leon Shaw @ 2012-09-19 8:47 UTC (permalink / raw)
To: Randi Botse; +Cc: linux-c-programming

On Wed, Sep 19, 2012 at 3:59 PM, Randi Botse <nightdecoder@gmail.com> wrote:
> Hi Phil, Jon
>
> Thanks, now I'm clear on this: assignment doesn't care about the type
> modifier.
>
> Code such as
>
> unsigned int j = 0xffeeddcc;
> int i = j;
>
> Both have the same value, depending on how they are interpreted (is this
> assumption correct?)
>

According to C99, when applying an integer conversion, "if the new type
is signed and the value cannot be represented in it, either the result
is implementation-defined or an implementation-defined signal is
raised". But most implementations keep the same memory representation.

> Because,
>
> printf("%u", i) will be different from printf("%i", i)
> - but -
> printf("%u", i) will be the same as printf("%u", j)
>
>
> Actually, I'm asking this because I often see a pointer cast to char *.
>
> Let me show you with this example.
> Consider some structures...
>
> struct a_data {
>     unsigned char f1[4];
>     unsigned char f2[6];
>     unsigned short f3[2];
> };
>
> and other structs named b_data, c_data, etc.
>
> Then there is a general function to process all types of structure,
> maybe something like this:
>
> int process_data(char *buffer, size_t len);
>
> Then if we cast, for example, a pointer to an a_data struct to a char *
> as follows:
>
> struct a_data a;
> process_data((char*) &a, sizeof(a));
>
> I thought that since it was cast to char *, the cast is a "problem",
> because every signed char in the buffer has a range of CHAR_MIN to
> CHAR_MAX, so values from CHAR_MAX to UCHAR_MAX will be broken (signed
> char overflow).
>

Actually, whether char is signed or unsigned is implementation-defined,
though it is commonly signed. Values in the range SCHAR_MAX+1 ~ UCHAR_MAX
can be mapped to SCHAR_MIN ~ -1.

For a pointer that denotes a memory region, the type it points to
doesn't cause much of a problem as long as you don't simply dereference
it. In such cases, void * might be less confusing.

Regards,
Leon

> I think process_data() should be declared as
>
> int process_data(unsigned char *buffer, size_t len)
>
> This declaration seems correct and works for me.
>
> However, now I conceptually understand why this works.
>
> Thanks.

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Pointer to a char
2012-09-19 7:59 ` Randi Botse
2012-09-19 8:47 ` Leon Shaw
@ 2012-09-19 18:09 ` Jon Mayo
1 sibling, 0 replies; 7+ messages in thread
From: Jon Mayo @ 2012-09-19 18:09 UTC (permalink / raw)
To: Randi Botse; +Cc: linux-c-programming

On Wed, Sep 19, 2012 at 12:59 AM, Randi Botse <nightdecoder@gmail.com> wrote:
> Hi Phil, Jon
>
> Thanks, now I'm clear on this: assignment doesn't care about the type
> modifier.
>
> Code such as
>
> unsigned int j = 0xffeeddcc;
> int i = j;
>
> Both have the same value, depending on how they are interpreted (is this
> assumption correct?)
>
> Because,
>
> printf("%u", i) will be different from printf("%i", i)
> - but -
> printf("%u", i) will be the same as printf("%u", j)
>

Most architectures will work that way. Some are a little nutty, but
standard C allows for implementation-defined behavior when you interpret
a data type the wrong way. (It gets pretty specific about signed versus
unsigned representations.)

I will readily admit that years of FORTH programming have warped my mind
and I no longer worry too much about signed int and unsigned int. I tend
to think more in terms of how big a data type is. The 'union' keyword is
especially useful for dealing with different ways to interpret the same
sized piece of memory.

float is often the same size as int, so this potentially works on some
platforms:

float f = 1;
int i = *(int*)&f;
printf("%u", i);

It would print some weird number that shows you how dramatically an
internal representation can differ if you manage to interpret it
incorrectly. (This trick is often used to dump float values in
hexadecimal "%x" for debugging purposes. Strictly speaking, the pointer
cast breaks C's aliasing rules; a union or memcpy() is the safer way to
do it.)

> Actually, I'm asking this because I often see a pointer cast to char *.
>
> Let me show you with this example.
> Consider some structures...
>
> struct a_data {
>     unsigned char f1[4];
>     unsigned char f2[6];
>     unsigned short f3[2];
> };
>
> and other structs named b_data, c_data, etc.
>
> Then there is a general function to process all types of structure,
> maybe something like this:
>
> int process_data(char *buffer, size_t len);
>

I would have made process_data take a void * instead, so people wouldn't
have to hack around C's simple type checking with casts.

Casting struct a_data * to char * doesn't change the value of the
pointer. If you ignore compiler warnings, it will work without the cast.

Now inside process_data, the char * type is useful, because the pointer
math will use sizeof(char) [which is always 1] for calculating offsets,
while your sizeof(struct a_data) will be around 14 bytes. Some people
don't like to use void * here, because the compiler will not like
pointer math done on a void *, as sizeof(void) doesn't make sense. Old
compilers hacked around this by treating it as 1. New compilers will
prefer that you cast or load the void * into a char * (which is how I
usually implement these sorts of functions).

> Then if we cast, for example, a pointer to an a_data struct to a char *
> as follows:
>
> struct a_data a;
> process_data((char*) &a, sizeof(a));
>
> I thought that since it was cast to char *, the cast is a "problem",
> because every signed char in the buffer has a range of CHAR_MIN to
> CHAR_MAX, so values from CHAR_MAX to UCHAR_MAX will be broken (signed
> char overflow).
>

Casting a pointer won't alter the data. It just changes how you would
interpret the data when dereferencing it. If process_data doesn't
dereference, then there is probably not a problem.

(Also, char can be signed or unsigned. In gcc you could use something
like -funsigned-char to override the default setting, which can
potentially break a lot of assumptions in your system and library
headers.)

> I think process_data() should be declared as
>
> int process_data(unsigned char *buffer, size_t len)
>

You should use:
signed char * - if you need signed
unsigned char * - if you need unsigned
char * - if you don't care either way, as long as the pointer points to
something char-sized.
void * - if you don't even care about what type it points to (maybe a
struct).

Note: this rule is different from signed/unsigned int; int is always
signed.

I use char * when dealing with strings, because I won't be using them in
situations where negative values could be a problem. But one terrible
issue you can run into is a simple function like this:

int isupper(char c)
{
    const int upper_table[256] = { ... }; /* UCHAR_MAX + 1 is more appropriate than 256 here. */
    return upper_table[c]; /* oops, what if c is negative? that would be a terrible array index. */
    /* we would actually want to cast c to unsigned char, or at least
       check c >= 0 && c < upper_table_len */
}

> This declaration seems correct and works for me.
>
> However, now I conceptually understand why this works.
>
> Thanks.

^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-09-19 18:09 UTC | newest]

Thread overview: 7+ messages
2012-09-18 9:29 Pointer to a char Randi Botse
2012-09-18 10:29 ` Phil Sutter
2012-09-18 10:33 ` Duan Fugang-B38611
2012-09-19 1:04 ` Jon Mayo
2012-09-19 7:59 ` Randi Botse
2012-09-19 8:47 ` Leon Shaw
2012-09-19 18:09 ` Jon Mayo