A C function that converts a GB2312-encoded string to Chinese characters, without using the iconv library functions, plus test code
Date: 2023-04-05 18:01:11 Views: 115
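Hello, below is a function that converts a GB2312-encoded string to UTF-8 so that the Chinese characters display correctly, together with test code: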
```c
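#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// GB2312 codes do not map to Unicode by a fixed offset, so a lookup table is
// needed. This demo table covers only the characters in the test string; a
// full converter needs every entry (e.g. from the Unicode GB2312.TXT mapping).
typedef struct { unsigned short gb, ucs; } gb_map_entry;
static const gb_map_entry gb_table[] = {
    {0xB2E2, 0x6D4B}, // 测
    {0xBCE4, 0x95F4}, // 间
    {0xCAB1, 0x65F6}, // 时
    {0xCAD4, 0x8BD5}, // 试
};

// Return the Unicode code point for a GB2312 code, or U+FFFD if it is unknown
static unsigned short gb2312_lookup(unsigned short gb) {
    size_t n = sizeof(gb_table) / sizeof(gb_table[0]);
    for (size_t k = 0; k < n; k++) {
        if (gb_table[k].gb == gb) return gb_table[k].ucs;
    }
    return 0xFFFD;
}

// Convert a GB2312-encoded string to UTF-8; the caller must free() the result
char* gb2312_to_utf8(const char* gb2312_str) {
    size_t len = strlen(gb2312_str);
    // One invalid input byte can expand to 3 bytes (U+FFFD), hence len * 3 + 1
    char* utf8_str = (char*)malloc(len * 3 + 1);
    if (utf8_str == NULL) return NULL;
    size_t i = 0, j = 0;
    while (i < len) {
        unsigned char c1 = (unsigned char)gb2312_str[i++];
        if (c1 < 0x80) { // ASCII passes through unchanged
            utf8_str[j++] = (char)c1;
            continue;
        }
        unsigned short u = 0xFFFD; // used when the trail byte is missing or invalid
        if (i < len && (unsigned char)gb2312_str[i] >= 0xA1) {
            unsigned char c2 = (unsigned char)gb2312_str[i++];
            u = gb2312_lookup((unsigned short)((c1 << 8) | c2));
        }
        if (u < 0x800) { // 2-byte UTF-8 (e.g. Greek or Cyrillic in a full table)
            utf8_str[j++] = (char)(0xC0 | (u >> 6));
            utf8_str[j++] = (char)(0x80 | (u & 0x3F));
        } else {         // 3-byte UTF-8 (all Chinese characters)
            utf8_str[j++] = (char)(0xE0 | (u >> 12));
            utf8_str[j++] = (char)(0x80 | ((u >> 6) & 0x3F));
            utf8_str[j++] = (char)(0x80 | (u & 0x3F));
        }
    }
    utf8_str[j] = '\0';
    return utf8_str;
}

int main(void) {
    // "测试时间" encoded in GB2312
    const char* gb2312_str = "\xb2\xe2\xca\xd4\xca\xb1\xbc\xe4";
    char* utf8_str = gb2312_to_utf8(gb2312_str);
    if (utf8_str == NULL) return 1;
    printf("%s\n", utf8_str); // prints 测试时间 on a UTF-8 terminal
    free(utf8_str);
    return 0;
}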
```
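This function converts a GB2312-encoded string into a UTF-8-encoded string that contains the Chinese characters. The test code takes a GB2312-encoded string, converts it to UTF-8, and prints the result; the output is readable only on a terminal that expects UTF-8.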
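Because GB2312 codes are not laid out in Unicode order, the GB2312-to-Unicode step generally comes down to a table lookup rather than arithmetic. The following is a minimal sketch of that lookup, assuming a table sorted by GB2312 code and searched with the standard `bsearch`. The names `gb_entry`, `gb_sorted`, `gb_compare`, and `gb_find` are illustrative, and the four rows are only a stand-in for a complete mapping (which could be generated from a published mapping file such as the Unicode Consortium's GB2312.TXT).

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical table entry: a GB2312 (EUC-CN) code and its Unicode code point
typedef struct {
    unsigned short gb;
    unsigned short ucs;
} gb_entry;

// Illustrative subset, sorted by gb; a real table has several thousand rows
static const gb_entry gb_sorted[] = {
    {0xB2E2, 0x6D4B}, // 测
    {0xBCE4, 0x95F4}, // 间
    {0xCAB1, 0x65F6}, // 时
    {0xCAD4, 0x8BD5}, // 试
};

// Comparison callback for bsearch: both arguments point to gb_entry values
static int gb_compare(const void* key, const void* elem) {
    unsigned short a = ((const gb_entry*)key)->gb;
    unsigned short b = ((const gb_entry*)elem)->gb;
    return (a > b) - (a < b);
}

// Return the Unicode code point for gb, or 0xFFFD if it is not in the table
unsigned short gb_find(unsigned short gb) {
    gb_entry key = { gb, 0 };
    size_t count = sizeof(gb_sorted) / sizeof(gb_sorted[0]);
    assert(count > 0);
    const gb_entry* hit = (const gb_entry*)bsearch(
        &key, gb_sorted, count, sizeof(gb_entry), gb_compare);
    return hit ? hit->ucs : 0xFFFD;
}
```

With this table, `gb_find(0xCAD4)` yields 0x8BD5 (试), while a code that is missing from the table yields the replacement character U+FFFD. Binary search keeps each lookup at O(log n) comparisons, which matters once the table holds the full character set.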