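C++ Programming Practices

This article records mistakes that are easy to make, and should be avoided, when using C++.

Double free error

Case 1

When a class owns memory allocated with new and releases it with delete in its destructor, but does not explicitly provide a deep-copying copy constructor, passing an object of that class to a function by value causes a double free: the compiler-generated copy constructor copies only the pointer, so both the copy and the original delete the same memory. Possible fixes:

  1. Provide a deep-copying copy constructor
  2. Pass objects of this class by reference only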
#include <iostream>

using namespace std;

class A {
public:
  A() {
    array = new int[5];
  }
  // Uncommenting this deep-copying constructor removes the double free.
  // A(const A& a) {
  //   array = new int[5];
  // }

  ~A() {
    cout << "Destroy" << endl;
    delete[] array;
  }
private:
  int* array;
};

void func(A a) {
  return;
}

int main() {
  A a;
  func(a);  // double free: the copy and `a` delete the same array; a copy constructor is needed
  return 0;
}
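
As a sketch of fix 1, the class below adds a deep-copying copy constructor and a matching assignment operator. The names DeepA, at, data and first_element are hypothetical and used only for illustration; they are not part of the example above.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical fixed variant of class A: it owns a heap array and copies deeply.
class DeepA {
public:
  explicit DeepA(std::size_t n = 5) : size_(n), array_(new int[n]()) {}

  // Copy constructor: allocate a separate array and copy the elements.
  DeepA(const DeepA& other)
      : size_(other.size_), array_(new int[other.size_]) {
    std::copy(other.array_, other.array_ + size_, array_);
  }

  // Copy assignment via copy-and-swap; the by-value parameter is the copy.
  DeepA& operator=(DeepA other) {
    std::swap(size_, other.size_);
    std::swap(array_, other.array_);
    return *this;
  }

  ~DeepA() { delete[] array_; }

  int& at(std::size_t i) { return array_[i]; }
  const int* data() const { return array_; }

private:
  std::size_t size_;
  int* array_;
};

// Passing by value is now safe: the parameter owns its own array.
inline int first_element(DeepA a) { return a.at(0); }
```

With this variant, a call such as first_element(a) copies the array instead of sharing it, so each object deletes only its own memory.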

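Releasing a class's memory

After an object is instantiated on the stack, its lifetime is managed by the enclosing scope. To release the memory it owns earlier, we can overload operator= and assign a fresh object to it: the assignment operator frees the old buffer immediately, and the temporary on the right-hand side is destroyed at the end of the statement. The object's own destructor still runs when it goes out of scope, which is why "destroy" is printed twice below.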

#include <iostream>
using namespace std;
class B {
private:
  int* array;
  int len;
public:
  B() {
    len = 0;
    array = nullptr;
  }
  B(int n) {
    cout << "B is constructed!" << endl;
    array = new int[n];
    len = n;
  }
  B& operator=(B&& b) {
    if (this != &b) {
      delete[] array;     // release the old buffer right away
      array = b.array;    // take over the buffer of the right-hand side
      len = b.len;
      b.array = nullptr;
      b.len = 0;
    }
    return *this;
  }
  ~B() {
    delete[] array;
    cout << "destroy" << endl;
  }
  int get_n() const {
    return len;
  }
};
int main() {
  B b(6);
  b = B();  // b's array is freed inside operator=
  // cout << b.get_n() << endl;
  return 0;
}
// Output:
// B is constructed!
// destroy
// destroy

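Storing class instances in std::vector

  1. When a vector stores pointers to objects, you are responsible for releasing the memory yourself, because the vector does not call the destructors of the pointed-to objects.

    Note: operator * has lower precedence than operator ., so parentheses are needed.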

#include <iostream>
#include <vector>
using namespace std;
class A {
public:
  A() {
    arr = new int[2];
    arr[0] = 5;
    arr[1] = 8;
  }
  ~A() {
    cout << count << " time Destroy A!" << endl;
    count++;
  }
  void show() {
    cout << arr[0] << " " << arr[1] << endl;
  }
private:
  int* arr;
  static int count;
};
int A::count = 1;

int main() {
  vector<A*> v;
  for (int i = 0; i < 8; i++) {
    A* tmp = new A();
    v.push_back(tmp);
  }
  (*(v[0])).show();  // same as v[0]->show()
  return 0;          // the 8 objects are never deleted, so ~A() never runs
}
=>
5 8
  2. Storing class instances directly in a vector causes many objects to be created and destroyed
#include <iostream>
#include <vector>
using namespace std;
class A {
public:
  A() {
    arr = new int[2];
    arr[0] = 5;
    arr[1] = 8;
  }
  ~A() {
    // arr is deliberately not deleted: the copies made by the vector share it
    cout << count << " time Destroy A!" << endl;
    count++;
  }
  void show() {
    cout << arr[0] << " " << arr[1] << endl;
  }
private:
  int* arr;
  static int count;
};
int A::count = 1;

int main() {
  vector<A> v;  // stores instances, not pointers
  for (int i = 0; i < 4; i++) {
    A tmp;
    v.push_back(tmp);  // copies tmp; reallocation copies and destroys old elements
  }
  v[0].show();
  return 0;
}

=>
1 time Destroy A!
2 time Destroy A!
3 time Destroy A!
4 time Destroy A!
5 time Destroy A!
6 time Destroy A!
7 time Destroy A!
5 8
8 time Destroy A!
9 time Destroy A!
10 time Destroy A!
11 time Destroy A!
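
One way to keep pointers in a vector without releasing them by hand is to store std::unique_ptr. The sketch below is only an illustration under that assumption; Tracked and destroyed_after_scope are hypothetical names, and the counter exists only to make the destructor calls observable.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type that counts how many instances were destroyed.
struct Tracked {
  static int destroyed;
  int value;
  explicit Tracked(int v) : value(v) {}
  ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Fills a vector with n heap objects, lets it go out of scope, and returns
// how many destructors ran in the process.
inline int destroyed_after_scope(std::size_t n) {
  const int before = Tracked::destroyed;
  {
    std::vector<std::unique_ptr<Tracked>> v;
    v.reserve(n);  // avoid reallocations while filling
    for (std::size_t i = 0; i < n; ++i) {
      v.push_back(std::unique_ptr<Tracked>(new Tracked(static_cast<int>(i))));
    }
  }  // the vector destroys each unique_ptr, which deletes its Tracked
  return Tracked::destroyed - before;
}
```

Moving a unique_ptr into the vector transfers ownership without copying the object, so each object is destroyed exactly once, when the vector goes away.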

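Colored text output in C++ (Linux)

Colored output is based on escape sequences of the form ESC[*m, where ESC is \033 in octal and * can be a combination of several attributes separated by ";".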

  1. printf: print red text

    #include <cstdio>
    int main()
    {
     printf("\033[31mred\033[0m\n");
     return 0;
    }

  2. cout: print green text

    #include <iostream>
    using namespace std;
    int main()
    {
     cout << "\033[32mmodified\033[0m" << endl;
     return 0;
    }
    

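Common control codes

\033[0m reset all attributes
\033[1m bold (high intensity)
\033[2m half brightness
\033[3m italic
\033[4m underline
\033[5m blink
\033[6m rapid blink
\033[7m reverse video
\033[8m conceal
\033[9m strikethrough
10-19 font selection
21-29 mostly the opposites of 1-9
30-37 set the foreground color
40-47 set the background color
30: black
31: red
32: green
33: yellow
34: blue
35: purple (magenta)
36: dark green (cyan)
37: white
38 turn underline on, set default foreground color
39 turn underline off, set default foreground color
40 black background
41 red background
42 green background
43 brown background
44 blue background
45 magenta background
46 peacock blue (cyan) background
47 white background
48 purpose unknown
49 set default background color
50-89 unused
90-109 set foreground and background again, with lighter colors than the ones above
\033[nA move the cursor up n lines
\033[nB move the cursor down n lines
\033[nC move the cursor right n columns
\033[nD move the cursor left n columns
\033[y;xH set the cursor position
\033[2J clear the screen
\033[K clear from the cursor to the end of the line
\033[s save the cursor position
\033[u restore the cursor position
\033[?25l hide the cursor
\033[?25h show the cursor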
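
The codes above can be wrapped in a small helper so that call sites do not repeat the escape sequences. This is a minimal sketch; colorize is a hypothetical name, and it assumes the terminal understands these sequences.

```cpp
#include <string>

// Wraps text in an attribute sequence and resets all attributes afterwards.
// `code` is one of the numbers listed above, e.g. 31 for red or 32 for green.
// colorize is a hypothetical helper, not a library function.
inline std::string colorize(const std::string& text, int code) {
  return "\033[" + std::to_string(code) + "m" + text + "\033[0m";
}
```

For example, std::cout << colorize("done", 32) << std::endl; would print "done" in green on a terminal that supports these codes.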

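Where to declare variables

In a for loop, if the variable is an object, its constructor is called every time the scope is entered and its destructor every time the scope is left. This reduces efficiency.

// Inefficient implementation
for (int i = 0; i < 1000000; ++i) {
    Foo f;                  // constructor and destructor are each called 1000000 times!
    f.DoSomething(i);
}

Declaring such a variable outside the loop scope is much more efficient:

Foo f;                      // constructor and destructor are each called only once
for (int i = 0; i < 1000000; ++i) {
    f.DoSomething(i);
}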

Base class with a map<> member variable

#include <iostream>
#include <map>
#include <string>
#include <vector>

using namespace std;
class A{
public:
  // virtual destructor so that deleting a B through an A* is safe
  virtual ~A() {
    delete[] array;
  }
  virtual void mins(int a, int b) {
    cout << "class A: " << a - b << endl;
  }
  virtual void plus(int a, int b) {
    cout << "class A: " << a << " " << b << endl;
  }
  map<string, int> coef;
  int *array = nullptr;
};

class B : public A {
public:
  B(int a, int b) {
    coef["id"] = a;
    coef["name"] = b;
    int* data = new int[10];   // must be allocated on the heap
    for (int i = 0; i < 10; i++)
      data[i] = i;
    array = data;
  }
  void plus(int a, int b) override {
    cout << "class B: " << a << " " << b << endl;
  }
};

int main(int argc, char *argv[])
{
  B* b = new B(22, 32);
  b->mins(4, 3);
  vector<A*> v;
  v.push_back(b);
  A* a = v[0];
  v[0]->plus(2, 3);
  cout << a->coef["name"] << endl;
  cout << a->coef["id"] << endl;
  for (int i = 0; i < 10; i++) {
    cout << a->array[i] << endl;
  }
  delete b;
  return 0;
}

When the base class owns the map<> member, the values stored by the derived class are still accessible after the derived-class pointer is converted to a base-class pointer.

vector

  1. When using vector<int*>, the storage for each element must be allocated with new

    void func(vector<int*>& v) {  // by reference, so the caller sees the new element
       int* tmp = new int;
       v.push_back(tmp);
    }

  2. Calling clear() on a vector only sets its size to 0; the memory is not released. vector has built-in memory management: when a vector reaches the end of its lifetime, its destructor destroys the elements and frees the space they occupy, so a vector usually does not need to be released explicitly. However, if the vector stores pointers, the objects they point to are not destroyed when the vector is destroyed, and that memory is not released.

  3. vector.push_back calls the copy constructor

    #include <iostream>
    #include <vector>

    using namespace std;

    class A{
    public:
     A() {
       cout << "construct A" << endl;
     }
     A(const A& a) {
       cout << "copy A" << endl;
     }
     A& operator =(const A& a) {
       cout << "= A" << endl;
       return *this;
     }
    };

    int main(int argc, char *argv[])
    {
     vector<A> v;
     A a1;
     v.push_back(a1); // calls the copy constructor
     A a2 = v[0];  // calls the copy constructor
     A& a3 = v[0]; // does not call the copy constructor
     A* a4 = &v[0]; // does not call the copy constructor
     return 0;
    }

  4. vector initialization (list initialization, available since C++11)

    vector<int> t1 {3, 2, 4};
    vector<int> t2 {4, 3, 2};
    vector<vector<int>> v {t1, t2};
    v.push_back({3, 1});
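
To make point 2 concrete, the sketch below records the capacity of a vector after filling it, after clear(), and after swapping it with an empty temporary, which is a common way to actually give the buffer back. Capacities and observe_capacity are hypothetical names; the exact capacity after the swap is implementation-dependent, though it is 0 on the common standard libraries.

```cpp
#include <cstddef>
#include <vector>

// Capacities observed at three points in a vector's life.
struct Capacities {
  std::size_t after_fill;
  std::size_t after_clear;
  std::size_t after_swap;
};

inline Capacities observe_capacity(std::size_t n) {
  std::vector<int> v(n, 0);
  Capacities c{};
  c.after_fill = v.capacity();
  v.clear();                   // size becomes 0, capacity is kept
  c.after_clear = v.capacity();
  std::vector<int>().swap(v);  // swap with an empty temporary to release the buffer
  c.after_swap = v.capacity();
  return c;
}
```

Since C++11, v.shrink_to_fit() expresses the same intent, but it is only a non-binding request to the implementation.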
    

Global variables

// global.h
#pragma once
extern bool parallel;

// global.cpp
#include "global.h"
bool parallel = false;

// other files only need to include global.h

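Creating and using temporary objects

To create a temporary object, append a pair of parentheses directly after the type name, for example shape(3,5) or int(8).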

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

template <typename T>
class print {
 public:
  void operator()(const T& element) { cout << element << " "; }
};

int main() {
  vector<int> iv{0, 1, 2, 3, 4, 5};
  for_each(iv.begin(), iv.end(), print<int>());
  return 0;
}

A static const integral member can be initialized directly inside the class

template <typename T>
class testClass{
  public:
  static const int kConst = 5;
};

Round up to a multiple of 8

Round a number up to the next multiple of 8 (the trick works for any N that is a power of two)

size_t N = 8;
size_t num = 13;
size_t num2 = (num + N - 1) & ~(N - 1);  // 16
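
The expression can be wrapped in a function; round_up is a hypothetical name. Adding n - 1 pushes any value that is not already a multiple past the next boundary, and the mask ~(n - 1) clears the low bits. This only works when n is a power of two.

```cpp
#include <cstddef>

// Rounds num up to the next multiple of n; n must be a power of two.
inline std::size_t round_up(std::size_t num, std::size_t n) {
  return (num + n - 1) & ~(n - 1);
}
```

For example, round_up(13, 8) gives 16, while a value that is already a multiple, such as 16, is left unchanged.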

Type deduction

  1. The general case

    template <class I, class T>
    void func_impl(I iter, T t) {
      T tmp;  // T is deduced from *iter
      ...
    }
    template <class I>
    void func(I iter) {
      func_impl(iter, *iter);
    }

  2. Deducing a function's return type

    template <class T>
    struct MyIter {
      typedef T value_type;
      T* ptr;
      MyIter(T* p = 0) : ptr(p) {}
      T& operator*() const { return *ptr; }
    };

    template <class I>
    typename I::value_type func(I ite) { return *ite; }

  3. Traits programming

```c++
template <class I>
struct iterator_traits {
  typedef typename I::value_type value_type;
};

// Partial specialization: extract int when int* is passed in
template <class T>
struct iterator_traits<T*> {
  typedef T value_type;
};

// Partial specialization: extract int when const int* is passed in
template <class T>
struct iterator_traits<const T*> {
  typedef T value_type;
};

template <class I>
typename iterator_traits<I>::value_type func(I ite) {
  return *ite;
}
```

  4. Calling different function implementations for different types

    template <class InputIterator, class Distance>
    void advance(InputIterator& i, Distance n) {
      if (is_random_access_iterator(i))
        advance_RAI(i, n);
      else if (is_bidirectional_iterator(i))
        advance_BI(i, n);
      else
        advance_II(i, n);
    }

    An implementation like the one above only decides at run time which version to use, which hurts efficiency. It is better to select the right version at compile time, and function overloading can achieve this.

    // Five types used as tags (tag types)
    struct input_iterator_tag{};
    struct output_iterator_tag{};
    struct forward_iterator_tag: public input_iterator_tag {};
    struct bidirectional_iterator_tag : public forward_iterator_tag {};
    struct random_access_iterator_tag : public bidirectional_iterator_tag{};

The reason for using inheritance here is discussed later.
```c++
// Overloads
template <class InputIterator, class Distance>
void __advance(InputIterator& i, Distance n, input_iterator_tag) {
  while (n--) ++i;
}

template <class ForwardIterator, class Distance>
void __advance(ForwardIterator& i, Distance n, forward_iterator_tag) {
  __advance(i, n, input_iterator_tag());
}

template <class BidirectionalIterator, class Distance>
void __advance(BidirectionalIterator& i, Distance n, bidirectional_iterator_tag) {
  if (n >= 0)
    while (n--) ++i;
  else
    while (n++) --i;
}

template <class RandomAccessIterator, class Distance>
void __advance(RandomAccessIterator& i, Distance n, random_access_iterator_tag) {
  i += n;
}
```

In the code above, the last parameter of each overload only declares a type and has no name, because its sole purpose is to trigger overload resolution.

template <class InputIterator, class Distance>
void advance(InputIterator& i, Distance n) {
  // iterator_category() creates a temporary tag object
  __advance(i, n, typename iterator_traits<InputIterator>::iterator_category());
}

The following code shows why inheritance is used

#include <iostream>
using namespace std;

struct B{};
struct D1 : public B{};
struct D2 : public D1 {};
template <class I>
void func (I& p, B) {
  cout << "B version" << endl;
}
template <class I>
void func (I& p, D2) {
  cout << "D2 version" << endl;
}
int main() {
  int* p = nullptr;
  func(p, B());  // output: B version
  func(p, D1()); // output: B version (D1 has no overload, so it converts to B)
  func(p, D2()); // output: D2 version
}

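Special problems

  1. The determinant of a positive semidefinite matrix is non-negative in theory, but in practice the computed value can be negative because of limited numerical precision (for example when computing the determinant with the Eigen library)!

  2. When a pointer refers to memory that is allocated inside a function and is still needed after the function returns, allocate that memory with new. If it lives on the stack instead of the heap, the stack space is reclaimed when the function returns, and the pointer is left pointing at garbage.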

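C++ performance optimization

STL

Dynamic array: vector

In g++, the memory allocation strategy of vector is to grow the storage by a factor of $\times 2$ whenever push_back runs out of capacity. Every growth reallocates the whole storage and copies the existing values.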

#include <iostream>
#include <vector>
using namespace std;
int main ()
{
  vector<unsigned int> myvector;
  unsigned int capacity = myvector.capacity();

  for(unsigned int i = 0; i <  100; ++i) {
    myvector.push_back(i);
    if(capacity != myvector.capacity())
      {
        capacity = myvector.capacity();
        cout << myvector.capacity() << endl;
        cout << "address of myvector: " << &(myvector[0]) << endl;
      }
  }
  return 0;
}

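Because allocating and copying memory dynamically like this is expensive, the storage of a vector is usually reserved explicitly when the required size is known in advance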

#include <iostream>
#include <vector>
#include <chrono>
using namespace std;
using namespace std::chrono;

void func1() {
  auto start = high_resolution_clock::now();
  vector<int> v;
  for (int i = 0; i < 100000000; i++) {
    v.push_back(i);
  }

  auto end = high_resolution_clock::now();
  auto duration = duration_cast<microseconds>(end - start);
  cout << "func1 time cost: " << duration.count() << " us" << endl;
}

void func2() {

  auto start = high_resolution_clock::now();
  vector<int> v;
  v.reserve(100000000);
  for (int i = 0; i < 100000000; i++) {
    v.push_back(i);
  }

  auto end = high_resolution_clock::now();
  auto duration = duration_cast<microseconds>(end - start);
  cout << "func2 time cost: " << duration.count() << " us" << endl;
}

int main ()
{
  func1();
  func2();
  return 0;
}

=====>
func1 time cost: 1397472 us
func2 time cost: 1154251 us

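set

In the STL, the insert operation returns a pair: first is an iterator, and second indicates whether the insertion took place. If the element already exists, second is false; otherwise it is true.

We can use this property to detect duplicates.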
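
A minimal sketch of this duplicate check, using the second member of the pair returned by std::set::insert. The helper name find_duplicates is hypothetical.

```cpp
#include <set>
#include <vector>

// Returns every value that was already seen earlier in `values`,
// in the order in which the repeats occur.
inline std::vector<int> find_duplicates(const std::vector<int>& values) {
  std::set<int> seen;
  std::vector<int> duplicates;
  for (int v : values) {
    // insert returns pair<iterator, bool>; second is false if v was present.
    if (!seen.insert(v).second) {
      duplicates.push_back(v);
    }
  }
  return duplicates;
}
```

Because the lookup and the insertion happen in a single call, there is no need for a separate find before each insert.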

Creating a pool

#include <iostream>
#include <string>
#include <chrono>
using std::chrono::system_clock;

using namespace std;
#define size 1000000
class A{
public:
  A *pre, *next;
  static size_t cnt;
  A() {
    pre = nullptr;
    next = nullptr;
  }
  ~A() {
    cnt++;
  }
};
size_t A::cnt = 0;
int main() {
  {
    A *list = nullptr, *cur = nullptr;
    system_clock::time_point start = system_clock::now();
    for (size_t i = 0; i < size; i++) {
      A* tem = new A();
      if (list == nullptr) {
        list = tem;
        cur = list;
      } else {
        cur->next = tem;
        tem->pre = cur;
        cur = tem;
      }
    }
    system_clock::time_point end = system_clock::now();
    std::chrono::duration<double> elapsed_seconds = end - start;
    std::cout << "time = " << elapsed_seconds.count() << "s" << std::endl;

    start = system_clock::now();
    A* array = new A[size];
    list = &array[0];
    cur = list;
    for (size_t i = 1; i < size; i++) {
      cur->next = &array[i];
      array[i].pre = cur;
      cur = cur->next;
    }
    end = system_clock::now();
    elapsed_seconds = end - start;
    std::cout << "time = " << elapsed_seconds.count() << "s" << std::endl;
    delete[] array;
  }
  cout << "Call destroy A times: " << A::cnt << endl;
  return 0;
}

Using new A[size] reduces the number of memory allocations, which gives better performance.

File operations

  1. Create a directory

    #include <sys/stat.h>

    file_name_ = "paths/";
    string dir = file_name_ + to_string(my_id);
    mkdir(dir.c_str(), S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH);

  2. Delete a file

```c++
#include <cstdio>
string file_name = file_name_ + to_string(my_id) + "/path_" + to_string(path_count_) + ".csv";
remove(file_name.c_str());
```

  3. Change the stream buffer size

void setbuf ( FILE * stream, char * buffer );
/* setbuf example */
#include <stdio.h>

int main ()
{
  char buffer[BUFSIZ];
  FILE *pFile1, *pFile2;

  pFile1=fopen ("myfile1.txt","w");
  pFile2=fopen ("myfile2.txt","a");

  setbuf ( pFile1 , buffer );
  fputs ("This is sent to a buffered stream",pFile1);
  fflush (pFile1);

  setbuf ( pFile2 , NULL );
  fputs ("This is sent to an unbuffered stream",pFile2);

  fclose (pFile1);
  fclose (pFile2);

  return 0;
}

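This function assigns a buffer to the stream but cannot change the size of the buffer; the size is always the default constant BUFSIZ.

int setvbuf ( FILE * stream, char * buffer, int mode, size_t size );

mode can be one of three values:

  • _IOFBF: full buffering
  • _IOLBF: line buffering
  • _IONBF: no buffering

The C standard only allows setvbuf before any other operation has been performed on the stream, so switching modes repeatedly as in the following example relies on implementation behavior:
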
#include <iostream>
#include <cstdio>

#define SIZE 1024

using namespace std;

int main()
{
    char buffer[SIZE] = "...";
    char str[] = "This is first line\nThis is second line";
    FILE *fp = fopen("test.txt","wb+");

    /* no buffering, buffer remains unchanged */
    setvbuf(fp,buffer,_IONBF,SIZE);
    fwrite(str, sizeof(str), 1, fp);
    cout << buffer << endl;

    /* line buffering, only a single line is buffered */
    setvbuf(fp,buffer,_IOLBF,SIZE);
    fwrite(str, sizeof(str), 1, fp);
    cout << buffer << endl;

    /* full buffering, all the contents are buffered */
    setvbuf(fp,buffer,_IOFBF,SIZE);
    fwrite(str, sizeof(str), 1, fp);
    cout << buffer << endl;

    fclose(fp);
    return 0;
}

Output

...
This is second line
This is first line
This is second line

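Programming tips

  1. To check whether two floating-point numbers a and b are equal, do not use a == b; use fabs(a - b) < 1e-9 instead.
  2. To check whether an integer is odd, use x % 2 != 0 rather than x % 2 == 1, because x may be negative. If n > 0 is known, n & 0x1 can be used to test parity.
  3. When a char value is used as an array index (for example to count how often each character occurs in a string), remember that char may be negative, so cast it to unsigned char before using it as an index. (Whether plain char is signed is implementation-defined; on many platforms it is signed, with the range [-128, 127].)
  4. Prefer vector and string to dynamically allocated arrays. A multidimensional array can be represented with vector: vector<vector<int> > array(row_num, vector<int>(col_num, 0)); and reserve can be used to avoid unnecessary reallocations.
  5. A one-to-many unordered_map can be implemented as unordered_map<int, vector<pair<int, int>>> cache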
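
Tips 1 to 3 can be sketched as small helpers. The names nearly_equal, is_odd and count_chars are hypothetical, and the tolerance 1e-9 is the one from the text; a suitable tolerance depends on the magnitude of the values being compared.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <string>

// Tip 1: compare floating-point values with a tolerance.
inline bool nearly_equal(double a, double b, double eps = 1e-9) {
  return std::fabs(a - b) < eps;
}

// Tip 2: x % 2 is -1 for negative odd x, so compare against 0.
inline bool is_odd(int x) { return x % 2 != 0; }

// Tip 3: convert char to unsigned char before using it as an index.
inline std::array<std::size_t, 256> count_chars(const std::string& s) {
  std::array<std::size_t, 256> counts{};
  for (char c : s) {
    ++counts[static_cast<unsigned char>(c)];
  }
  return counts;
}
```

Without the cast in count_chars, a byte such as 0xE9 would become a negative index on platforms where char is signed.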

struct

struct sds {
    long len;
    long free;
    char buf[];  // flexible array member (C99; a compiler extension in C++)
};

sds* sh;
sh = (sds*)malloc(sizeof(sds) + 10);
// The extra 10 bytes all belong to buf,
// and sizeof(sds) is 16 (with 8-byte long): buf itself takes no space

sh = (sds*)malloc(sizeof(long));
sh->len = 10;
// This compiles and runs without error as long as free and buf are not accessed

The char buf[] is a placeholder for a string. Since the maximum length of the string is not known at compile time, the struct reserves the name for it so that it can be properly addressed.

When memory is allocated at runtime, the allocation must include the length of the string plus the sizeof the struct; the structure can then be passed around together with the string, which is accessible via the array.

#define SAFE_FREE(p) do{free(p); p=NULL;} while(0)

#define CHECK(COND)                        \
  do {                                     \
    if (!(COND)) {                         \
      LOG_ERR("Check failure: %s", #COND); \
      exit(-1);                            \
    }                                      \
  } while (0)

The do{} while(0) wrapper is there to avoid errors. The macro itself must not end with a semicolon, because the caller supplies it. For example, if SAFE_FREE were defined without do{} while(0), then

if (p != NULL)
    SAFE_FREE(p);
else
    ...

would be expanded to:

if (p != NULL)
   free(p); p = NULL;
else
    ...

which fails to compile.
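
A compilable sketch of the same situation: the macro is used as the only statement of an if branch that is followed by else. toggle_buffer is a hypothetical helper written just for this demonstration.

```cpp
#include <cstddef>
#include <cstdlib>

// No trailing semicolon after while (0): the caller supplies it.
#define SAFE_FREE(p) \
  do {               \
    std::free(p);    \
    (p) = NULL;      \
  } while (0)

// Hypothetical helper: frees *slot if it is set, otherwise allocates n bytes.
// The if/else only compiles because the macro expands to a single statement.
inline void toggle_buffer(char** slot, std::size_t n) {
  if (*slot != NULL)
    SAFE_FREE(*slot);
  else
    *slot = static_cast<char*>(std::malloc(n));
}
```

Wrapping the macro body in plain braces instead would also fail here, because the caller's semicolon after the closing brace would end the if statement before the else.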

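Running shell commands from C++

On Linux, system() and popen() can be used to start a process that runs a shell command. With system(), the calling process waits until the shell command has finished (it waits for the child with waitpid) before returning, whereas popen() returns without waiting for the command to finish. system() can be thought of as serial execution, during which the caller gives up control, and popen() as parallel execution.

  • system()

  • popen(): the popen() function starts a process by creating a pipe and invoking the shell. Because a pipe is by definition unidirectional, the type argument can only specify reading or writing, not both.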

    #include <stdio.h>
    FILE *popen(const char *command, const char *type);

    int pclose(FILE *stream);

    example:

    #include <stdio.h>
    ...
    FILE *fp;
    int status;
    char path[PATH_MAX];

    fp = popen("ls *", "r");
    if (fp == NULL)
        /* Handle error */;

    while (fgets(path, PATH_MAX, fp) != NULL)
        printf("%s", path);

    status = pclose(fp);
    if (status == -1) {
        /* Error reported by pclose() */
        ...
    } else {
        /* Use macros described under wait() to inspect `status' in order
           to determine success/failure of command executed by popen() */
        ...
    }
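
The popen() pattern above can be wrapped into a function that collects the command's output in a string. This is a sketch for POSIX systems only; run_command is a hypothetical name, and error handling is reduced to exceptions.

```cpp
#include <stdio.h>

#include <stdexcept>
#include <string>

// Runs `command` through the shell and returns everything it wrote to stdout.
// run_command is a hypothetical helper; popen/pclose are POSIX, not ISO C++.
inline std::string run_command(const std::string& command) {
  FILE* pipe = popen(command.c_str(), "r");
  if (pipe == NULL) {
    throw std::runtime_error("popen failed");
  }
  std::string output;
  char buffer[256];
  while (fgets(buffer, sizeof(buffer), pipe) != NULL) {
    output += buffer;
  }
  // pclose waits for the command to finish and returns its wait status.
  if (pclose(pipe) == -1) {
    throw std::runtime_error("pclose failed");
  }
  return output;
}
```

popen() itself returns immediately, but reading until end of file and calling pclose() means this helper still blocks until the command has finished.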

The `exec` family of functions: an exec function replaces the current process image with a new program, which keeps the PID of the original process.

int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char * const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);

path: the name of the program to start, including its path

arg: the arguments passed to the program; by convention the first one is the name of the command without its path, and the list must end with NULL

Return value: on success these functions do not return; on failure they return -1

1. `exec` functions with l (`execl`, `execlp`, `execle`): the arguments are passed as a variable argument list terminated by a null pointer.
  ```c++
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(void)
  {
      printf("entering main process---\n");
      execl("/bin/ls", "ls", "-l", (char *)NULL);
      printf("exiting main process ----\n");  // only reached if execl fails
      return 0;
  }
  ```
  2. exec functions with p (execlp, execvp): the first argument does not have to be a full path; the command name is enough, and it is looked up in the PATH environment variable

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
       printf("entering main process---\n");
       int ret;
       char *const argv[] = {(char *)"ls", (char *)"-l", NULL};
       ret = execvp("ls", argv);
       if(ret == -1)
           perror("execvp error");
       printf("exiting main process ----\n");
       return 0;
    }
    

Starting a process to run a Python file

  pid_t pid;
  pid = fork();  // fork rather than vfork: the child runs non-trivial code before exec
  if (pid == 0) {
    // child process
    std::string home_path = getenv("HOME");
    std::string path = home_path + "/.rtai/python/rtai/executor/executor.py";
    strcpy(executor_path, path.c_str());
    std::string port = std::to_string(executor_port_);
    strcpy(port_arg, port.c_str());
    int ret;
    char command[] = "python";
    char* argv[] = {command, executor_path, (char*)"--port", port_arg, NULL};
    ret = execvp("python", argv);
    if (ret == -1) {
      cout << "Create python process failed!" << endl;
      _exit(127);  // do not fall through into the parent's code path
    }
  } else if (pid > 0) {
      cout << "Parent process " << endl;
  }
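
The fragment above depends on members of its surrounding class. As a self-contained sketch of the same fork + execvp pattern, the hypothetical helper run_and_wait below starts a command, waits for it with waitpid and returns its exit code. It assumes a POSIX system.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <string>
#include <vector>

// Runs args[0] with the given arguments in a child process and returns its
// exit code, or -1 if the child could not be created or did not exit normally.
// run_and_wait is a hypothetical helper name.
inline int run_and_wait(const std::vector<std::string>& args) {
  if (args.empty()) return -1;

  // Build the NULL-terminated argv array before forking.
  std::vector<char*> argv;
  for (const std::string& a : args) {
    argv.push_back(const_cast<char*>(a.c_str()));
  }
  argv.push_back(NULL);

  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    execvp(argv[0], argv.data());
    _exit(127);  // only reached if execvp failed
  }

  int status = 0;
  if (waitpid(pid, &status, 0) == -1) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Exit code 127 in the child follows the shell convention for "command not found", so the parent can tell a failed exec from a normal exit.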

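Handy functions

  • accumulate(): computes the sum of an initial value and all elements in a given range. It can also combine the elements with a user-specified binary operation. It is declared in the header numeric.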

    #include <numeric>
    // accumulate函数的第一个功能,求和
    int total;
    total = accumulate ( v1.begin ( ) , v1.end ( ) , 0 );
    
  • void fill(ForwardIterator first, ForwardIterator last, const T& val): assigns val to every element in the range.

    #include <algorithm>
    std::vector<int> myvector(8);                       // myvector: 0 0 0 0 0 0 0 0
      
    std::fill(myvector.begin(), myvector.begin() + 4, 5);   // myvector: 5 5 5 5 0 0 0 0
    
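  • sort: sorts the elements of an array or container. Basic data types have a default comparison; for objects of user-defined classes and other types without operator<, a custom comparison function must be passed to sort as the third argument.

    vector<int> v = {2, 0, 1, 5, 9, 2, 7};
    sort(v.begin(), v.end()); // ascending order by default, equivalent to the next line
    sort(v.begin(), v.end(), less<int>()); // if the comparison returns true, the first argument is placed before the second
    // descending order
    sort(v.rbegin(), v.rend());
    sort(v.begin(), v.end(), greater<int>());

    When the elements are not of a basic data type, a custom comparison function is needed. As a special case, sort can be called directly on a two-dimensional array (vector<vector<int>>): rows are compared lexicographically, so by default they are ordered by the first column, with ties broken by the following columns.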

    #include <algorithm>
    #include <iostream>
    #include <vector>
    using namespace std;

    class A {
    public:
        int a1, a2;
        A(int m, int n): a1(m), a2(n) {}
    };
      
    class B {
    public:
        int b1, b2;
        B(int m, int n): b1(m), b2(n) {}
    };
      
    bool cmp1(A const *a, A const *b) {
        return a->a1 < b->a1;
    }
      
    bool cmp2(B const &a, B const &b) {
        return a.b1 < b.b1;
    }
      
    void printArray(vector<A*> array) {
        for (int i = 0; i < array.size(); ++i) {
            cout << array[i]->a1 << " " << array[i]->a2 << endl;
        }
        cout << endl;
    }
      
    void printArray2(vector<B> array) {
        for (int i = 0; i < array.size(); ++i) {
            cout << array[i].b1 << " " << array[i].b2 << endl;
        }
        cout << endl;
    }
      
    int main() {
      
        vector<A*> array;
        array.push_back(new A(65, 100));
        array.push_back(new A(70, 150));
        array.push_back(new A(56, 90));
        array.push_back(new A(75, 190));
        array.push_back(new A(60, 95));
        array.push_back(new A(68, 110));
      
        printArray(array);
        sort(array.begin(), array.end(), cmp1);
        printArray(array);
      
        vector<B> array2;
        array2.push_back(B(65, 100));
        array2.push_back(B(70, 150));
        array2.push_back(B(56, 90));
        array2.push_back(B(75, 190));
        array2.push_back(B(60, 95));
        array2.push_back(B(68, 110));
      
        printArray2(array2);
        sort(array2.begin(), array2.end(), cmp2);
        printArray2(array2);
      
        return 0;
    }
    
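  • iterator unique(iterator it_1, iterator it_2): removes duplicates, that is, it "removes" all adjacent duplicate elements in the sequence and keeps only one of each. The removal is not a real deletion: the positions of the duplicates are taken over by the non-duplicate elements (details below). Because only adjacent duplicates are "removed", the sequence is usually sorted before unique is called.

    unique works by repeatedly moving later non-duplicate elements toward the front, in other words by overwriting duplicates with non-duplicate elements.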

    #include <algorithm>
    #include <iostream>
    #include <vector>
    using namespace std;

    int main()
    {

        vector<int> a = {1,3,3,4,5,6,6,7};
        vector<int>::iterator it_1 = a.begin();
        vector<int>::iterator it_2 = a.end();
        vector<int>::iterator new_end;

        new_end = unique(it_1, it_2); // note the return value of unique
        a.erase(new_end, it_2);       // erase the leftover tail
        cout << "a after removing duplicates: ";
        for (size_t i = 0; i < a.size(); i++)
            cout << a[i];
        cout << endl;

    }
    
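  • getenv: get the path of the home directory

    #include <stdio.h>
    #include <stdlib.h>

    /* getenv returns NULL when the variable is not set */
    static const char *env_or_unset(const char *name)
    {
      const char *value = getenv(name);
      return value ? value : "(unset)";
    }

    int main ()
    {
      printf("PATH : %s\n", env_or_unset("PATH"));
      printf("HOME : %s\n", env_or_unset("HOME"));
      printf("ROOT : %s\n", env_or_unset("ROOT"));

      return(0);
    }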
    
  • C++17 introduced the filesystem library, which provides a large number of functions for working with files.

#include <iostream>
#include <fstream>
#include <cstdlib>
#include <filesystem>
namespace fs = std::filesystem;
int main()
{
    fs::current_path(fs::temp_directory_path());
    fs::create_directories("sandbox/1/2/a");
    fs::create_directory("sandbox/1/2/b");
    fs::permissions("sandbox/1/2/b", fs::perms::others_all, fs::perm_options::remove);
    fs::create_directory("sandbox/1/2/c", "sandbox/1/2/b");
    std::system("ls -l sandbox/1/2");
    std::system("tree sandbox");
    fs::remove_all("sandbox");
}
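
Three of the points from the list above, sketched as small functions: accumulate with a binary operation, sorting a two-dimensional vector, and the sort + unique + erase idiom. The function names product, sorted_rows and distinct are hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <numeric>
#include <vector>

// accumulate with a binary operation: the product of all elements.
inline long long product(const std::vector<int>& v) {
  return std::accumulate(v.begin(), v.end(), 1LL,
                         std::multiplies<long long>());
}

// Sorting a vector of rows uses lexicographic comparison of the rows.
inline std::vector<std::vector<int>> sorted_rows(
    std::vector<std::vector<int>> rows) {
  std::sort(rows.begin(), rows.end());
  return rows;
}

// sort + unique + erase removes all duplicates, not just adjacent ones.
inline std::vector<int> distinct(std::vector<int> v) {
  std::sort(v.begin(), v.end());
  v.erase(std::unique(v.begin(), v.end()), v.end());
  return v;
}
```

The initial value passed to accumulate also fixes the type of the result, which is why product starts from 1LL rather than 1.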

anonymous or unnamed namespaces

You can create an explicit namespace but not give it a name:

namespace
{
  int MyFunc() { return 0; }
}

This is called an unnamed or anonymous namespace and it is useful when you want to make variable declarations invisible to code in other files (i.e. give them internal linkage) without having to create a named namespace. All code in the same file can see the identifiers in an unnamed namespace but the identifiers, along with the namespace itself, are not visible outside that file—or more precisely outside the translation unit.

This is typically used in implementation files (.cpp/.c files).

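Compilation

  • Default search paths for C++ header files (CPLUS_INCLUDE_PATH). The search order for #include "headfile.h" is:

    • first the current directory (the directory of the including file)
    • then the directories given with -I
    • then the gcc environment variable CPLUS_INCLUDE_PATH (C programs use C_INCLUDE_PATH)
    • finally gcc's built-in directories

    The search order for #include <headfile.h> is:

    • first the directories given with -I
    • then the gcc environment variable CPLUS_INCLUDE_PATH
    • finally gcc's built-in directories

    Note that the #include <> form does not search the current directory!

  • C++ library files, search order at build time:

    • first -L
    • then the environment variable LIBRARY_PATH
    • then the built-in directories /lib, /usr/lib, /usr/local/lib
  • Search paths for shared libraries at run time:

    • the shared library search path specified when the target was built (given with the gcc option "-Wl,-rpath,"; multiple paths are separated by a colon ":")
    • the search path given by the environment variable LD_LIBRARY_PATH (multiple paths are separated by a colon ":")
    • the search paths listed in the configuration file /etc/ld.so.conf;
    • the default search path /lib;
    • the default search path /usr/lib.

    Note that the shared library search path does not include the current directory

When using the -l option, the linker scans its inputs from left to right and looks for unresolved symbols only in the libraries that follow, so -l lib is usually placed at the end of the command line; otherwise undefined reference errors may occur.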

Code example

get ip

#include <stdio.h>
#include <sys/types.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <string.h>
#include <arpa/inet.h>

int main (int argc, const char * argv[]) {
    struct ifaddrs * ifAddrStruct=NULL;
    struct ifaddrs * ifa=NULL;
    void * tmpAddrPtr=NULL;

    getifaddrs(&ifAddrStruct);

    for (ifa = ifAddrStruct; ifa != NULL; ifa = ifa->ifa_next) {
        if (!ifa->ifa_addr) {
            continue;
        }
        if (ifa->ifa_addr->sa_family == AF_INET) { // check it is IP4
            // is a valid IP4 Address
            tmpAddrPtr=&((struct sockaddr_in *)ifa->ifa_addr)->sin_addr;
            char addressBuffer[INET_ADDRSTRLEN];
            inet_ntop(AF_INET, tmpAddrPtr, addressBuffer, INET_ADDRSTRLEN);
            printf("%s IP4 Address %s\n", ifa->ifa_name, addressBuffer);
        } else if (ifa->ifa_addr->sa_family == AF_INET6) { // check it is IP6
            // is a valid IP6 Address
            tmpAddrPtr=&((struct sockaddr_in6 *)ifa->ifa_addr)->sin6_addr;
            char addressBuffer[INET6_ADDRSTRLEN];
            inet_ntop(AF_INET6, tmpAddrPtr, addressBuffer, INET6_ADDRSTRLEN);
            printf("%s IP6 Address %s\n", ifa->ifa_name, addressBuffer);
        }
    }
    if (ifAddrStruct!=NULL) freeifaddrs(ifAddrStruct);
    return 0;
}

measure time

  #include <chrono>
  using std::chrono::system_clock;

  system_clock::time_point start = system_clock::now();
  worker();
  system_clock::time_point end = system_clock::now();
  std::chrono::duration<double> elapsed_seconds = end - start;
  std::cout << "time = " << elapsed_seconds.count() << "s" << std::endl;

parallel memcopy

void parallel_memcopy(uint8_t *dst, const uint8_t *src, int64_t nbytes,
                      uintptr_t block_size, int num_threads) {
  std::vector<std::thread> threadpool(num_threads);
  uint8_t *left = pointer_logical_and(src + block_size - 1, ~(block_size - 1));
  uint8_t *right = pointer_logical_and(src + nbytes, ~(block_size - 1));
  int64_t num_blocks = (right - left) / block_size;

  // Update right address
  right = right - (num_blocks % num_threads) * block_size;

  // Now we divide these blocks between available threads. The remainder is
  // handled on the main thread.
  int64_t chunk_size = (right - left) / num_threads;
  int64_t prefix = left - src;
  int64_t suffix = src + nbytes - right;
  // Now the data layout is | prefix | k * num_threads * block_size | suffix |.
  // We have chunk_size = k * block_size, therefore the data layout is
  // | prefix | num_threads * chunk_size | suffix |.
  // Each thread gets a "chunk" of k blocks.

  // Start all threads first and handle leftovers while threads run.
  for (int i = 0; i < num_threads; i++) {
    threadpool[i] = std::thread(std::memcpy, dst + prefix + i * chunk_size,
                                left + i * chunk_size, chunk_size);
  }

  std::memcpy(dst, src, prefix);
  std::memcpy(dst + prefix + num_threads * chunk_size, right, suffix);

  for (auto &t : threadpool) {
    if (t.joinable()) {
      t.join();
    }
  }
}

write to file

/**
 * Write a sequence of bytes into a file descriptor. This will block until one
 * of the following happens: (1) there is an error (2) end of file, or (3) all
 * length bytes have been written.
 *
 * @param fd The file descriptor to write to. It can be non-blocking.
 * @param cursor The cursor pointing to the beginning of the bytes to send.
 * @param length The size of the bytes sequence to write.
 * @return int Whether there was an error while writing. 0 corresponds to
 *         success and -1 corresponds to an error (errno will be set).
 */
int write_bytes(int fd, uint8_t *cursor, size_t length) {
  ssize_t nbytes = 0;
  while (length > 0) {
    /* While we haven't written the whole message, write to the file
     * descriptor, advance the cursor, and decrease the amount left to write. */
    nbytes = write(fd, cursor, length);
    if (nbytes < 0) {
      if (errno == EAGAIN || errno == EWOULDBLOCK) {
        continue;
      }
      /* Any other error is reported to the caller, as documented above;
       * errno is still set. */
      return -1;
    }
    if (nbytes == 0) {
      return -1;
    }
    cursor += nbytes;
    length -= nbytes;
  }
  return 0;
}

std::thread

class job

#include <iostream>
#include <thread>
#include <vector>

using namespace std;

class Work{
public:
  Work() {}
  void operator()() {
    cout << "worker is done" << endl;
  }
};
int main(int argc, char *argv[]) {
  vector<thread> m_thread(10);
  m_thread[0] = thread(Work());
  m_thread[0].join();
  return 0;
}

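Note that thread(Work()) does not run the temporary Work object itself: the callable is copied (or moved, when it is a temporary) into storage owned by the new thread, and the thread runs that copy.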
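To make the copy visible, here is a minimal sketch that counts copy and move constructions of a functor handed to std::thread. `CountingWork` and the three helper functions are hypothetical names introduced for illustration; the exact number of moves depends on the standard library implementation, so only the copy counts are meaningful.

```cpp
#include <atomic>
#include <functional>
#include <thread>

// Hypothetical instrumented functor: counts how often it is copied or moved.
struct CountingWork {
  static std::atomic<int> copies;
  static std::atomic<int> moves;
  CountingWork() = default;
  CountingWork(const CountingWork&) { ++copies; }
  CountingWork(CountingWork&&) noexcept { ++moves; }
  void operator()() const {}
};
std::atomic<int> CountingWork::copies{0};
std::atomic<int> CountingWork::moves{0};

// Passing a temporary: the functor is moved into the thread, never copied.
// Braces avoid declaring a function by accident (the "most vexing parse").
int copies_when_passing_temporary() {
  CountingWork::copies = 0;
  std::thread t{CountingWork()};
  t.join();
  return CountingWork::copies.load();
}

// Passing a named object: the thread stores its own copy of it.
int copies_when_passing_lvalue() {
  CountingWork::copies = 0;
  CountingWork w;
  std::thread t(w);
  t.join();
  return CountingWork::copies.load();
}

// Wrapping in std::ref: the thread calls the original object directly,
// so the functor is neither copied nor moved.
int copies_and_moves_with_ref() {
  CountingWork::copies = 0;
  CountingWork::moves = 0;
  CountingWork w;
  std::thread t(std::ref(w));
  t.join();
  return CountingWork::copies.load() + CountingWork::moves.load();
}
```

When the functor carries state that the caller wants to read back after join(), std::ref (or a pointer) is the usual choice; the caller then has to keep the object alive until the thread finishes.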

condition variable 1

#include <iostream>
#include <string>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <unistd.h>

std::mutex m;
std::condition_variable cv;
std::string data;
bool ready = false;
bool processed = false;

void worker_thread()
{
  // Wait until main() sends data
  std::unique_lock<std::mutex> lk(m);
  cv.wait(lk, []{return ready;});

  // after the wait, we own the lock.
  std::cout << "Worker thread is processing data\n";
  data += " after processing";

  // Send data back to main()
  processed = true;
  std::cout << "Worker thread signals data processing completed\n";

  // Manual unlocking is done before notifying, to avoid waking up
  // the waiting thread only to block again (see notify_one for details)
  lk.unlock();
  cv.notify_one();
}

int main()
{
  std::thread worker(worker_thread);

  data = "Example data";
  // send data to the worker thread
  {
    std::lock_guard<std::mutex> lk(m);
    ready = true;
    std::cout << "main() signals data ready for processing\n";
  }
  cv.notify_one();

  // wait for the worker
  {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, []{return processed;});
  }
  std::cout << "Back in main(), data = " << data << '\n';

  worker.join();
}

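The example above does not deadlock because std::condition_variable::wait releases the mutex while the thread is blocked and reacquires it before returning.

The documentation of std::condition_variable::wait reads as follows: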

unconditional (1)
void wait (unique_lock<mutex>& lck);
predicate (2)
template <class Predicate>
  void wait (unique_lock<mutex>& lck, Predicate pred);

Wait until notified. The execution of the current thread (which shall have locked lck's mutex) is blocked until notified.

At the moment of blocking the thread, the function automatically calls lck.unlock(), allowing other locked threads to continue.

Once notified (explicitly, by some other thread), the function unblocks and calls lck.lock(), leaving lck in the same state as when the function was called. Then the function returns (notice that this last mutex locking may block again the thread before returning).

Generally, the function is notified to wake up by a call in another thread either to member notify_one or to member notify_all. But certain implementations may produce spurious wake-up calls without any of these functions being called. Therefore, users of this function shall ensure their condition for resumption is met.

If pred is specified (2), the function only blocks if pred returns false, and notifications can only unblock the thread when it becomes true (which is specially useful to check against spurious wake-up calls). This version (2) behaves as if implemented as:

while (!pred()) wait(lck);

condition variable 2

#include <iostream>
#include <string>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <unistd.h>

std::mutex m;
std::condition_variable cv;

void worker_thread(int i)
{
  std::unique_lock<std::mutex> lk(m);
  cv.wait(lk);

  // after the wait, we own the lock.
  std::cout << i << " Worker thread is processing data\n";
}

int main()
{
  std::thread worker0(worker_thread, 0);
  std::thread worker1(worker_thread, 1);
  std::thread worker2(worker_thread, 2);
  sleep(3);

  cv.notify_one();
  std::cout << "notify one done!" << std::endl;
  cv.notify_one();
  std::cout << "notify one done!" << std::endl;
  cv.notify_one();
  std::cout << "notify one done!" << std::endl;
  worker0.join();
  worker1.join();
  worker2.join();
}

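Note that without sleep(3) the example above can deadlock. notify_one is non-blocking and a notification is not stored anywhere: if the three notify_one calls finish before some thread has reached wait, that thread blocks afterwards and no notification is left to wake it. The sleep only makes this unlikely; it does not rule it out.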
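A more robust alternative to the sleep is to keep the "signal" in shared state and wait with a predicate. The following is a minimal sketch under that assumption; `run_workers_with_permits` and the `permits` counter are hypothetical names, not part of the original example.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Starts num_workers threads that each wait for one "permit" and returns how
// many of them finished. Because the permit count lives in shared state, it
// does not matter whether notify_one() runs before or after a worker waits.
int run_workers_with_permits(int num_workers) {
  std::mutex m;
  std::condition_variable cv;
  int permits = 0;   // protected by m
  int finished = 0;  // protected by m

  std::vector<std::thread> workers;
  for (int i = 0; i < num_workers; ++i) {
    workers.emplace_back([&] {
      std::unique_lock<std::mutex> lk(m);
      // The predicate is evaluated before blocking, so a permit published
      // early is still seen; it also filters out spurious wake-ups.
      cv.wait(lk, [&] { return permits > 0; });
      --permits;
      ++finished;
    });
  }

  // No sleep: publish one permit per worker, then notify.
  for (int i = 0; i < num_workers; ++i) {
    {
      std::lock_guard<std::mutex> lk(m);
      ++permits;
    }
    cv.notify_one();
  }

  for (auto& t : workers) t.join();
  return finished;  // safe to read: all workers have been joined
}
```

The design choice is the same one the first condition variable example makes with its ready and processed flags: the condition variable only wakes threads up, while the fact that something happened is recorded in a variable guarded by the mutex.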

pass by reference

#include <iostream>
#include <utility>
#include <thread>
#include <chrono>

void f1(int n)
{
    for (int i = 0; i < 5; ++i) {
        std::cout << "Thread 1 executing\n";
        ++n;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

void f2(int& n)
{
    for (int i = 0; i < 5; ++i) {
        std::cout << "Thread 2 executing\n";
        ++n;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

class foo
{
public:
    void bar()
    {
        for (int i = 0; i < 5; ++i) {
            std::cout << "Thread 3 executing\n";
            ++n;
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        }
    }
    int n = 0;
};

class baz
{
public:
    void operator()()
    {
        for (int i = 0; i < 5; ++i) {
            std::cout << "Thread 4 executing\n";
            ++n;
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        }
    }
    int n = 0;
};

int main()
{
    int n = 0;
    foo f;
    baz b;
    std::thread t1; // t1 is not a thread
    std::thread t2(f1, n + 1); // pass by value
    std::thread t3(f2, std::ref(n)); // pass by reference
    std::thread t4(std::move(t3)); // t4 is now running f2(). t3 is no longer a thread
    std::thread t5(&foo::bar, &f); // t5 runs foo::bar() on object f
    std::thread t6(b); // t6 runs baz::operator() on a copy of object b
    t2.join();
    t4.join();
    t5.join();
    t6.join();
    std::cout << "Final value of n is " << n << '\n';
    std::cout << "Final value of f.n (foo::n) is " << f.n << '\n';
    std::cout << "Final value of b.n (baz::n) is " << b.n << '\n';
}

The arguments to the thread function are moved or copied by value. If a reference argument needs to be passed to the thread function, it has to be wrapped (e.g., with std::ref or std::cref).

Any return value from the function is ignored. If the function throws an exception, std::terminate is called. In order to pass return values or exceptions back to the calling thread, std::promise or std::async may be used.

future

#include<iostream>  //std::cout std::endl
#include<thread>   //std::thread
#include<future>   //std::future std::promise
#include<utility>   //std::ref
#include<chrono>   //std::chrono::seconds

void initiazer(std::promise<int> &promiseObj){
  std::cout << "Inside thread: " << std::this_thread::get_id() << std::endl;
  std::this_thread::sleep_for(std::chrono::seconds(1));
  promiseObj.set_value(35);
}

int main(){
  std::promise<int> promiseObj;
  std::future<int> futureObj = promiseObj.get_future();
  std::thread th(initiazer, std::ref(promiseObj));

  std::cout << futureObj.get() << std::endl;

  th.join();
  return 0;
}
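The text above also names std::async as a way to get return values and exceptions back to the caller. A minimal sketch of that route follows; `checked_square`, `square_via_async`, and `error_via_async` are hypothetical helpers made up for this illustration.

```cpp
#include <future>
#include <stdexcept>
#include <string>

// Hypothetical task: returns a value, or throws for negative input.
int checked_square(int x) {
  if (x < 0) throw std::invalid_argument("negative input");
  return x * x;
}

// std::launch::async runs the task on another thread; get() blocks until the
// task is done and hands its return value to the caller.
int square_via_async(int x) {
  std::future<int> result = std::async(std::launch::async, checked_square, x);
  return result.get();
}

// An exception thrown inside the task is stored in the shared state and
// rethrown by get() in the calling thread. Returns the message, or an empty
// string if the task did not throw.
std::string error_via_async(int x) {
  std::future<int> result = std::async(std::launch::async, checked_square, x);
  try {
    result.get();
    return "";
  } catch (const std::invalid_argument& e) {
    return e.what();
  }
}
```

Compared with the promise version, std::async creates the promise, the future, and the thread in one call, which is usually enough when the task is a plain function call.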

Do not compare doubles for equality with ==

#include <iostream>

using namespace std;

int main()
{
  double a = 4;
  double b = 0.1;
  double c = a - b;
  cout << "c = " << c << endl;
  double d = 3.1 + 0.8;
  cout << "d = " << d << endl;
  if (c == d)
    cout << d << endl;
  return 0;
}

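In the code above the test c == d fails, because the two results differ in the last bit after rounding. Compare against a tolerance instead: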

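#include <cmath>
#include <iostream>
using namespace std;

// Absolute tolerance for the comparison below.
constexpr double kEpsilon = 1e-8;

int main()
{
  double a = 4;
  double b = 0.1;
  double c = a - b;
  cout << "c = " << c << endl;
  double d = 3.1 + 0.8;
  cout << "d = " << d << endl;
  // std::fabs is the floating-point version; without <cmath>, a bare abs()
  // may resolve to the int version and truncate the difference to 0.
  if (std::fabs(c - d) < kEpsilon)
    cout << d << endl;
  return 0;
}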
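A fixed absolute tolerance such as 1e-8 works for values near 1 but is too strict for very large values and too loose for very small ones. A common refinement, sketched here as an assumption rather than as the only correct choice, is to scale the tolerance by the magnitude of the operands. `nearly_equal` and its default tolerances are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true when a and b differ by at most rel_tol times the
// larger magnitude, or by at most abs_tol (which handles values near zero).
bool nearly_equal(double a, double b, double rel_tol = 1e-9,
                  double abs_tol = 1e-12) {
  double diff = std::fabs(a - b);
  double scale = std::max(std::fabs(a), std::fabs(b));
  return diff <= std::max(rel_tol * scale, abs_tol);
}
```

With this helper, nearly_equal(4.0 - 0.1, 3.1 + 0.8) holds, and so does nearly_equal(1e20, 1e20 + 1e5), where an absolute tolerance of 1e-8 would report a difference. The right tolerances depend on the computation, so treat the defaults as placeholders.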

linux

/dev/urandom

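Structure of the random number generator

Secondary entropy pool: associated with the /dev/random device, 128 bytes in size; it is blocking

urandom entropy pool: associated with the /dev/urandom device, 128 bytes in size; it is non-blocking

Counters: the primary entropy pool, the secondary entropy pool, and the urandom entropy pool each have a counter, an integer that records how much randomness is currently available in the pool. It is an estimate that the generator derives from the environmental noise collected in the pool.

Output interfaces

The generator exposes three main interfaces: /dev/random, /dev/urandom, and get_random_bytes().

/dev/random and /dev/urandom

Both device files can be accessed from user space, even by unprivileged users; they return the requested number of random bytes.

get_random_bytes()

An interface for kernel use only. It returns the requested number of random bytes and is not discussed further here.

Differences between /dev/random and /dev/urandom

  • /dev/urandom returns the requested number of random bytes without blocking. If the request is very large, the output may be pseudo-random and of slightly lower quality; even so, it is good enough for most applications.

  • /dev/random also returns the requested number of random bytes, but of high quality (true random numbers). It is mainly used where high-quality randomness is required, for example when generating encryption keys.

This is the older pool-based design. Linux 5.6 removed the blocking pool, so on current kernels /dev/random blocks only until the generator has been initialized.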

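#include <cstdio>
#include <iostream>

using namespace std;

int main() {
  FILE* fp = fopen("/dev/urandom", "rb");
  if (fp == nullptr) {
    perror("fopen");
    return 1;
  }
  unsigned long seed = 0;
  // fread returns the number of complete items read, so 1 means success.
  size_t items = fread(&seed, sizeof(seed), 1, fp);
  fclose(fp);
  if (items != 1) {
    cerr << "short read from /dev/urandom" << endl;
    return 1;
  }
  // seed now holds sizeof(seed) random bytes.
  cout << seed << endl;
  return 0;
}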

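Command-line argument parsing

The getopt() function

int getopt(int argc, char * const argv[], const char *optstring);
extern char *optarg;
extern int optind, opterr, optopt;

Here argc and argv are the parameters of main, and optstring is the option string that lists the accepted options; the program then behaves according to the options supplied.

The global variables set by getopt() are:

  • extern char *optarg: the argument of the current option; for -t 12, the argument of option -t is 12
  • extern int optind: the index of the next argv element to be scanned
  • extern int opterr: if non-zero, getopt prints an error message for invalid options and missing arguments; the default is 1
  • extern int optopt: the invalid option character that was encountered

getopt() parses the command-line arguments. argc and argv are the argument count and the argument array, exactly as passed to main().

The format and use of optstring are explained with a concrete example, "abc:d::e":

  • No colon: the option takes no argument, e.g. a or b
  • One colon: the option before the colon requires an argument; -c 2 and -c2 both give option c the argument 2
  • Two colons (a GNU extension): the argument is optional; if present, it must follow the option with no space in between (e.g. -d3), otherwise it is not read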
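The three rules above can be checked without a real command line by building an argv array in code. This is a minimal sketch that assumes glibc's getopt (the two-colon form is a GNU extension); `parse_demo` is a hypothetical helper that returns one "name=value" entry per recognized option.

```cpp
#include <string>
#include <vector>
#include <unistd.h>

// Parses args (without the program name) using the option string "abc:d::e".
// Hypothetical helper for illustration only.
std::vector<std::string> parse_demo(std::vector<std::string> args) {
  std::vector<char*> argv;
  std::string prog = "demo";
  argv.push_back(&prog[0]);
  for (auto& a : args) argv.push_back(&a[0]);
  argv.push_back(nullptr);  // argv[argc] must be a null pointer
  int argc = static_cast<int>(argv.size()) - 1;

  optind = 1;  // start scanning at the first real argument
  opterr = 0;  // suppress getopt's own error messages
  std::vector<std::string> seen;
  int opt;
  while ((opt = getopt(argc, argv.data(), "abc:d::e")) != -1) {
    if (opt == '?') {
      seen.push_back("error");
    } else {
      // optarg is a null pointer when the option carries no argument.
      seen.push_back(std::string(1, static_cast<char>(opt)) + "=" +
                     (optarg ? optarg : ""));
    }
  }
  return seen;
}
```

For the arguments -a -c 2 -d4 -d 5 this yields a=, c=2, d=4, d=. The last entry shows the two-colon rule: the 5 is separated from -d by a space, so it is treated as an ordinary non-option argument and not as the argument of -d.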

#include <stdio.h>
#include <getopt.h>

int main(int argc, char*argv[]) {
    int opt;
    while((opt = getopt(argc,argv,"abc:d::e")) != -1){
        switch(opt){
            case 'a':
            case 'b':
            case 'e':
                /* These options take no argument, so optarg is NULL here. */
                printf("option %c\n",opt);
                break;
            case 'c':
                printf("option c:%s\n",optarg);
                break;
            case 'd':
                /* Optional argument: optarg is NULL when none was given. */
                printf("option d:%s\n",optarg ? optarg : "(none)");
                break;
            default:
                printf("option error\n");
                break;
        }
    }
    return 0;
}

unlink

#include <unistd.h>
int unlink(const char* pathname);
// Returns 0 on success, -1 on failure

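unlink() deletes the file name given by its argument. Note that calling unlink does not necessarily delete the file itself. It removes the name and decrements the file's link count. If other links to the file remain, only the count changes. Once the count reaches 0, the file is deleted as soon as every process that still has it open has closed it.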
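The delayed deletion can be observed directly. The sketch below assumes a local Linux file system mounted at /tmp; `unlinked_file_stays_readable` and the file name template are hypothetical.

```cpp
#include <cstdlib>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Creates a temporary file, unlinks it while it is still open, and checks
// that the data stays reachable through the open descriptor.
// Returns true if every step behaved as described.
bool unlinked_file_stays_readable() {
  char path[] = "/tmp/unlink_demo_XXXXXX";  // template filled in by mkstemp
  int fd = mkstemp(path);                   // creates and opens the file
  if (fd < 0) return false;

  bool ok = unlink(path) == 0;       // remove the only name of the file
  struct stat st;
  ok = ok && stat(path, &st) == -1;  // the path no longer exists
  ok = ok && fstat(fd, &st) == 0 && st.st_nlink == 0;  // link count is 0

  // The file itself still exists because fd is open: write and read back.
  const char msg[] = "still here";
  char buf[sizeof(msg)] = {0};
  ok = ok && write(fd, msg, sizeof(msg)) == static_cast<ssize_t>(sizeof(msg));
  ok = ok && pread(fd, buf, sizeof(buf), 0) == static_cast<ssize_t>(sizeof(msg));
  ok = ok && buf[0] == 's';

  close(fd);  // last reference dropped: the kernel can free the storage now
  return ok;
}
```

This open-then-unlink pattern is a common way to get a scratch file that cleans itself up even if the process crashes.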

LD_PRELOAD

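When the Linux dynamic linker loads a program, it first reads the LD_PRELOAD environment variable and the default configuration file /etc/ld.so.preload, and preloads the shared libraries listed there. These libraries are loaded even if the program does not depend on them. They take priority over the libraries found through the search path defined by LD_LIBRARY_PATH, so they are loaded before the shared libraries the program itself uses.

Ordinary user code: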

// rand.c
#include <stdio.h>
#include <time.h>
#include <stdlib.h>


int main(int argc, char *argv[])
{
    int i;

    srand(time(NULL));
    for(i=0;i<10;i++){
        printf("%d\n",rand()%100);
    }
    return 0;
}

The hijacking code:

// myrand.c
#include<stdio.h>

int rand()
{
    return 55;
}

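Compile myrand.c into libmyrand.so:

gcc -o libmyrand.so -shared -fPIC myrand.c

  • -shared produces a shared library.

  • -fPIC applies at compile time and tells the compiler to generate position-independent code (PIC). The generated code contains no absolute addresses, only relative ones, so the loader can place it anywhere in memory and it still runs correctly. Shared libraries require this, because the address at which they are loaded is not fixed.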

Use LD_PRELOAD to replace rand from glibc:

LD_PRELOAD=$PWD/libmyrand.so ./rand

Hijacking the Linux terminal command whoami

On Linux, whoami calls the underlying puts function.

The hijacking code (who.c):

#include <stdio.h>
#include <unistd.h>
#include <dlfcn.h>
#include <stdlib.h>


int puts(const char *message) {
    int (*new_puts)(const char *message);
    int result;
    new_puts = dlsym(RTLD_NEXT, "puts");
    printf("this is id:%d\n",getuid());     // fetch the caller's uid and print it
    result = new_puts(message);
    return result;
}

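Compile it into a .so file:

gcc who.c -o who.so -fPIC -shared -ldl -D_GNU_SOURCE

  • -ldl links against libdl, which provides the explicit loading functions dlopen, dlsym, dlclose, and dlerror
  • -D_GNU_SOURCE enables the GNU extensions; without it the compiler reports RTLD_NEXT as undefined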

Going further

Note that:

__attribute__((constructor))
    A function marked with __attribute__((constructor)) is called before main() starts executing.

__attribute__((destructor))
    A function marked with __attribute__((destructor)) is called after main() returns or after exit() has been called.

For example:

// hijack.c
#include<stdio.h>
#include<stdlib.h>

__attribute__((constructor)) void jxk() {
    system("ls");
}

Then compile it and set the environment variable:

# gcc -o libhijack.so -shared -fPIC hijack.c
# export LD_PRELOAD=$PWD/libhijack.so

// helloworld.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    printf("hello, world\n");

    return 0x0;
}

Compile the program and run it with the hijacking library preloaded:

# gcc -o helloworld helloworld.c
# export LD_PRELOAD=$PWD/libhijack.so
# ./helloworld

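Running it ends in an endless loop, because the hijacked function keeps being called: system("ls") starts new processes that inherit LD_PRELOAD, so each of them loads libhijack.so and runs the constructor again. To fix this, modify hijack.c:

#include <stdlib.h>

__attribute__((constructor)) void jxk() {
    if(getenv("LD_PRELOAD") == NULL) return;      // getenv reads the environment variable; if it is unset there is nothing left to do

    unsetenv("LD_PRELOAD");                       // unsetenv removes the variable, so child processes do not preload the library again
    system("ls");
}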

Use ldd to list the libraries a program depends on:

ldd a.out
        linux-vdso.so.1 (0x00007fffff6fe000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc3fccea000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fc3fcf00000)

Use readelf to view the symbol table of the executable, which lists the functions it calls:

readelf -Ws a.out

How can the hijacking function call back into the original library function? Use dlopen and dlsym.

dlopen()

The function dlopen() loads the dynamic library file named by the null-terminated string filename and returns an opaque "handle" for the dynamic library. If filename is NULL, then the returned handle is for the main program. If filename contains a slash ("/"), then it is interpreted as a (relative or absolute) pathname. Otherwise, the dynamic linker searches for the library

dlsym()

The function dlsym() takes a "handle" of a dynamic library returned by dlopen() and the null-terminated symbol name, returning the address where that symbol is loaded into memory. If the symbol is not found, in the specified library or any of the libraries that were automatically loaded by dlopen() when that library was loaded, dlsym() returns NULL.

#include<stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

typedef int (*Rand)(void);

int rand()
{
  printf("in my rand");
  void* handle = dlopen("libc.so.6", RTLD_LAZY);
  if (handle == NULL) {
    exit(0);
  }
  return ((Rand)dlsym(handle, "rand"))();
}

You may run into the following error:

LD_PRELOAD=$PWD/libmyrand.so ./a.out
./a.out: symbol lookup error: /home/liudy/Seafile/WorkSpace/CodeExample/ld_preload/demo/libmyrand.so: undefined symbol: dlopen

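This happens because -ldl must be added when building the .so. Listing it after the source file is the safe order, since some toolchains drop libraries that appear before the code that needs them:

gcc -o libmyrand.so -shared -fPIC myrand.c -ldl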

struct iovec

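The iov_base field of struct iovec points to a buffer that holds either the data that was read (for example, received from the network) or the data about to be written (for example, sent over the network). The iov_len field holds the maximum number of bytes to receive (read) or the number of bytes to write (write).

struct iovec {
    /* Starting address of the buffer */
    void  *iov_base;

    /* Number of bytes to transfer (length of the buffer) */
    size_t iov_len;
};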

On Linux, functions that take this structure as a parameter include:

#include <sys/uio.h>
ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);
ssize_t preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset);
ssize_t pwritev(int fd, const struct iovec *iov, int iovcnt,off_t offset);
ssize_t preadv2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags);
ssize_t pwritev2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags);
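As an illustration of how these calls use the structure, here is a minimal sketch of a gather write: two separate buffers are sent with one writev() call. A pipe stands in for a socket or file, and `gather_write_demo` is a hypothetical name.

```cpp
#include <string>
#include <sys/uio.h>
#include <unistd.h>

// Sends two separate buffers through a pipe with a single writev() call and
// returns what the reading end received (empty string on failure).
std::string gather_write_demo() {
  int fds[2];
  if (pipe(fds) != 0) return "";

  char header[] = "hello, ";
  char body[] = "iovec";
  struct iovec iov[2];
  iov[0].iov_base = header;
  iov[0].iov_len = sizeof(header) - 1;  // exclude the trailing '\0'
  iov[1].iov_base = body;
  iov[1].iov_len = sizeof(body) - 1;

  // The kernel gathers both buffers in order, as if they were contiguous.
  ssize_t written = writev(fds[1], iov, 2);
  close(fds[1]);

  std::string out;
  if (written > 0) {
    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof(buf));
    if (n > 0) out.assign(buf, static_cast<size_t>(n));
  }
  close(fds[0]);
  return out;
}
```

The benefit over two write() calls is one system call instead of two and no need to copy the pieces into a temporary contiguous buffer first.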

struct msghdr

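msghdr is an important Linux data structure, widely used for tasks such as passing file descriptors and passing credentials between processes. The msghdr structure is typically used with the following two functions:

#include <sys/types.h>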
#include <sys/socket.h>

ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);
ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);

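These functions send a message to a socket or receive a message from a socket. One particularly important use is passing a file descriptor over a Unix domain socket. struct msghdr is defined as follows:

struct msghdr {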
   void         *msg_name;       /* optional address */
   socklen_t     msg_namelen;    /* size of address */
   struct iovec *msg_iov;        /* scatter/gather array */
   size_t        msg_iovlen;     /* # elements in msg_iov */
   void         *msg_control;    /* ancillary data, see below */
   size_t        msg_controllen; /* ancillary data buffer len */
   int           msg_flags;      /* flags on received message */
};
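The file descriptor passing mentioned above can be sketched as follows. The descriptor travels as ancillary data of type SCM_RIGHTS in msg_control. `send_fd`, `recv_fd`, and `fd_passing_demo` are hypothetical helper names; for brevity the demo uses a socketpair inside a single process, whereas real use would have a different process on the other end.

```cpp
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

// Sends the descriptor fd_to_send over the Unix domain socket sock.
bool send_fd(int sock, int fd_to_send) {
  char byte = 'x';  // at least one byte of ordinary data accompanies the fd
  struct iovec iov;
  iov.iov_base = &byte;
  iov.iov_len = 1;

  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))];
  std::memset(control, 0, sizeof(control));

  struct msghdr msg;
  std::memset(&msg, 0, sizeof(msg));
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);

  struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;  // the payload is an array of descriptors
  cmsg->cmsg_len = CMSG_LEN(sizeof(int));
  std::memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

  return sendmsg(sock, &msg, 0) == 1;
}

// Receives one descriptor from sock; returns -1 on failure.
int recv_fd(int sock) {
  char byte = 0;
  struct iovec iov;
  iov.iov_base = &byte;
  iov.iov_len = 1;

  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))];
  struct msghdr msg;
  std::memset(&msg, 0, sizeof(msg));
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);

  if (recvmsg(sock, &msg, 0) <= 0) return -1;
  struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  if (cmsg == nullptr || cmsg->cmsg_level != SOL_SOCKET ||
      cmsg->cmsg_type != SCM_RIGHTS) {
    return -1;
  }
  int fd = -1;
  std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
  return fd;  // a new descriptor referring to the same open file
}

// Passes the read end of a pipe across a socketpair and reads through the
// received descriptor. Returns the bytes read (empty string on failure).
std::string fd_passing_demo() {
  int sv[2], p[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return "";
  if (pipe(p) != 0) { close(sv[0]); close(sv[1]); return ""; }
  std::string out;
  if (write(p[1], "via fd", 6) == 6 && send_fd(sv[0], p[0])) {
    int received = recv_fd(sv[1]);
    if (received >= 0) {
      char buf[16];
      ssize_t n = read(received, buf, sizeof(buf));
      if (n > 0) out.assign(buf, static_cast<size_t>(n));
      close(received);
    }
  }
  close(p[0]); close(p[1]); close(sv[0]); close(sv[1]);
  return out;
}
```

The received descriptor generally has a different number than the one that was sent; what is shared is the underlying open file, much like after dup().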

errno

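When programming in C on Linux, errno holds the error code of the most recent failed library or system call. A successful call is not guaranteed to reset errno, so check errno only after a function's return value signals an error. errno is an integer, and each value stands for one type of error. See the header file /usr/include/asm/errno.h for details.

Converting an error code to an error message string

fprintf(stderr,"error in CreateProcess %s, Process ID %d ",strerror(errno),processID);

Using the perror function

void perror(const char *s);

perror() writes the reason for the most recent error to standard error (stderr). The string pointed to by s is printed first, followed by the error description.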
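The rule "check the return value first, then errno" can be sketched like this. `open_and_report` is a hypothetical helper, and the path used to provoke the error is arbitrary.

```cpp
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Tries to open path read-only. Returns 0 on success, otherwise the errno
// value of the failed open(), with its description stored in *message.
int open_and_report(const char* path, std::string* message) {
  int fd = open(path, O_RDONLY);
  if (fd >= 0) {
    close(fd);
    message->clear();
    return 0;
  }
  // Look at errno only after the call reported failure, and save it right
  // away: later library calls may overwrite it.
  int saved = errno;
  *message = std::strerror(saved);
  return saved;
}
```

Called with a path that does not exist, such as /no/such/dir/no_such_file, the function returns ENOENT and fills in the matching message.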

Starting a new process

fork

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
  int tmp = 5;
  pid_t res = fork();
  if (res < 0) {
    // fork failed
    perror("fork");
  } else if (res == 0) {
    // this process is the child
    printf("im child[%d],father is %d,tmp is %d.\n", getpid(), getppid(), tmp++);
  } else {
    // this process is the parent
    printf("im father[%d],tmp is %d.\n", getpid(), tmp++);
  }
  printf("tmp = %d\n", tmp);
  return 0;
}

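A return value greater than 0 means the current process is the parent (the value is the child's PID), 0 means it is the child, and a negative value means that creating the child failed. The child gets a copy of the parent's data and code segments, and the order in which parent and child run is not defined. fork uses a copy-on-write strategy: pages are physically copied only when one of the two processes writes to them.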
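That parent and child each work on their own copy can be checked with a small sketch. `fork_isolation_demo` is a hypothetical helper; the exit code 7 is arbitrary and only serves to show that the child ran.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks, lets the child modify its own copy of value, and stores the child's
// exit code in *child_status. Returns the value the parent still sees
// afterwards, or -1 if fork or waitpid failed.
int fork_isolation_demo(int* child_status) {
  int value = 5;
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    value = 42;  // changes only the child's copy (copy-on-write)
    // _exit ends the child without flushing stdio buffers inherited from
    // the parent, which would otherwise be written twice.
    _exit(value == 42 ? 7 : 1);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  *child_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
  return value;  // still 5: the child's write never reached the parent
}
```

Calling waitpid also removes the uncertainty about scheduling order mentioned above, because the parent explicitly waits until the child has finished.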

vfork

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int tmp = 3;

int main()
{
  pid_t res = vfork();
  if (res < 0) {
    perror("vfork");
    _exit(1);
  } else if (res == 0) {
    tmp = 10;
    printf("child res = %d\n", tmp);
    _exit(0);
  } else {
    printf("father res = %d\n", tmp);
  }

  return 0;
}

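vfork guarantees that the child runs first; the parent can be scheduled again only after the child has called exec or exit. The child shares the parent's page tables directly, so data changed in the child is also changed in the parent.

vfork is generally used when the child calls exec immediately after it has been created.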
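The typical vfork-then-exec pattern can be sketched as follows. `run_shell_command` is a hypothetical helper, and the sketch assumes that /bin/sh exists on the system.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs "/bin/sh -c command" in a child created by vfork and returns the
// command's exit code, or -1 on failure.
int run_shell_command(const char* command) {
  pid_t pid = vfork();
  if (pid < 0) return -1;
  if (pid == 0) {
    // Child: it is borrowing the parent's memory, so it does nothing
    // except exec (or _exit if exec fails).
    execl("/bin/sh", "sh", "-c", command, static_cast<char*>(nullptr));
    _exit(127);  // reached only if exec failed
  }
  // Parent: resumes once the child has called exec or _exit.
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_shell_command("exit 3") returns 3. Keeping the child's code down to exec and _exit avoids the shared-memory side effects shown in the example above.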

Updated 2022-10-20