plafle's blog

plafle's blog

the pursuit of rationalism

Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub.

Quick Start

Create a new post

1
$ hexo new "My New Post"

More info: Writing

Run server

1
$ hexo server

More info: Server

Generate static files

1
$ hexo generate

More info: Generating

Deploy to remote sites

1
$ hexo deploy

More info: Deployment

概率论

1. 泊松分布

\(\hspace{1cm}\) 定义为\(P(x=k) = \frac{\lambda ^k}{k!}e^{-\lambda}\)\(\lambda > 0\),也可以记作\(X\) ~ \(P(\lambda)\)

\(\hspace{1cm}\) 对于一个二项分布\(B(n,p)\),如果\(n\)比较大,\(np\)适中(\(n \ge 100, np \le 10\))那么可以把二项分布\(B(n,p)\)近似为\(P(np)\)

2. 超几何分布

\(\hspace{1cm}\) 定义为\(H(N,M,n)\),表示从有限\(N\)个物品中(其中包含\(M\)个指定种类的物件)中抽出\(n\)个物件,成功抽出该指定种类的物件的次数(不放回)

3. 均匀分布

\(\hspace{1cm}\) 即在\([a,b]\)范围内等概率分布的情况,记作\(X\) ~ \(U(a,b)\)

4. 指数分布

\(\hspace{1cm}\) 其概率密度函数为 \[ f(x) = \begin{cases}\lambda e^{-\lambda x}\ \ \ \ x > 0\\ 0 \ \ \ \ \ \ \ \ \ \ \ \ x \le 0 \end{cases} \] \(\hspace{1cm}\) 记作\(X\) ~ \(Exp(\lambda)\)

5. 几何分布

\(\hspace{1cm}\) 定义为\(P(X = k) = (1 - p)^{k - 1} \times p\),也可记作\(X\) ~ \(G(p)\),表示第\(k\)次刚好第一次发生概率为\(p\)的事

6. 正态分布

\(\hspace{1cm}\) 其密度函数记为 \[ \varphi(x) = \frac 1{\sqrt{2\pi} \sigma}e^{-\frac{(x - \mu)^2}{2\sigma^2}} \] \(\hspace{1cm}\) 记作\(X\) ~ \(N(\mu, \sigma^2)\)

\(\hspace{1cm}\) 其分布函数记为\(\Phi(x) = \int_{-\infty}^x \phi(t) dt\)

\(\hspace{1cm}\)\(\mu = 0, \sigma = 1\)时,认为是标准正态分布,其概率密度函数记为\(\varphi_0(x)\),分布函数记为\(\Phi_0(x)\)

\(\hspace{1cm}\) 对于任何任意一个\(\varphi(x)\),有\(\varphi(x) = \frac 1 \sigma \varphi_0(\frac {x - \mu}{\sigma})\),以及\(\Phi(x) = \Phi_0(\frac{x - \mu}{\sigma})\)

多维随机变量相关

1.多维随机变量的联合分布函数,以及边缘分布函数(对于离散型和连续型均适用)

\(\hspace{1cm}\) 仿照单元模式,我们有如下定义:\(F(x,y) = P(X \le x, Y \le y)\)\((X,Y)\)的联合分布函数

\(\hspace{1cm}\) 根据定义,容易证明

\[ P(x_1 < X \le x_2, y_1 < Y \le y_2) = F(x_2,y_2) - F(x_1,y_2) - F(x_2,y_1) + F(x_1,y_1) \]

\(\hspace{1cm}\) 注意到\(F(x,y)\)有如下性质:

\[ F(-\infty,y) = F(x, -\infty) = 0 \]

\[ F(-\infty,-\infty) = 0, F(+\infty,+\infty) = 1 \]

\(\hspace{1cm}\) 定义边缘分布函数:称\(F_X(x),F_Y(y)\)分别为\((X,Y)\)关于\(X\)\(Y\)的边缘分布函数,其中\(F_X(x) = F(x,+\infty)\)\(F_Y(y) = F(y, +\infty)\)

2. 多维随机变量的联合概率密度以及边缘概率密度(连续型)

\(\hspace{1cm}\) 若存在二元函数\(f(x,y)\)使得\(\int_{-\infty}^x \int_{-\infty}^y f(u,v)dudv = F(x,y)\),则称\(f(x,y)\)为联合概率密度

\(\hspace{1cm}\) 定义边缘概率密度函数:称\(f_X(x), f_Y(y)\)分别为\((X,Y)\)关于\(X\)\(Y\)的边缘密度函数,其中\(f_X(x) = \int_{-\infty}^{+\infty}f(x,y)dy\), \(f_Y(y) = \int_{-\infty}^{+\infty}f(x,y)dx\)

3. 多维随机变量的联合分布列以及边缘分布列(离散型)

\(\hspace{1cm}\)\((X,Y)\)均是离散的,那么定义\(p_{ij} =P(X = x_i, Y = y_i)\)为联合分布列

\(\hspace{1cm}\) 同时定义\(p_{i·} = \sum_j p_{ij}\)\((X,Y)\)关于\(X\)的边缘分布列,\(p_{·j} =\sum _i p_{ij}\)\((X,Y)\)关于\(Y\)的边缘分布列

4. 多维变量的分布

\(\hspace{1cm}\)\(Z = \frac XY\),且\(f(x,y)\)表示变量\(X,Y\)的联合概率密度函数,则有: \[ f_Z(z) = \int_{-\infty}^{+\infty} |y| f(yz,y)dy \]

期望,方差,协方差

  • 常见分布的期望和方差:

    类型 E:期望 D:方差
    \(U(a,b)\) \(\frac {a + b}2\) \(\frac{(b - a)^2}{12}\)
    \(P(\lambda)\) \(\lambda\) \(\lambda\)
    \(E(\lambda)\) \(\frac 1\lambda\) \(\frac 1{\lambda^2}\)
    \(G(p)\) \(\frac 1p\) \(\frac q{p^2}\)
    \(N(\mu, \sigma^2)\) \(\mu\) \(\sigma^2\)
    \(B(n,p)\) \(np\) \(npq\)

    注:\(E(\lambda)\)为指数分布,\(G(p)\)为几何分布,\(q = 1 - p\)

  • 切比雪夫不等式: 对于一切分布\(X\),有: \[ P(|X - EX| > \varepsilon) \le \frac {DX}{\varepsilon^2} \]

  • 注意:如果随机变量\(X,Y\)满足\(E(XY) = E(X)E(Y)\)那么不能说明\(X\)\(Y\)是独立的!

  • 对于任意随机变量\(X_1,X_2,...,X_n\)以及任意常数\(k_0,k_1,...,k_n\),我们有: \[ D(k_0 + k_1X_1+ k_2X_2 + ... + k_nX_n) = \sum_{i = 1}^n{k_i}^2DX_i + 2 \sum_{1 \le i < j \le n}k_ik_jCov(X_i,X_j) \]

  • Cauchy-schwarz inequation: \[ EX^2 \times EY^2 \ge (E[XY])^2 \]

  • 最小二乘法拟合:对于随机变量\(X,Y\),尝试用\(y = \hat{a}x + \hat{b}\)拟合,那么最好的拟合直线中: \[ \hat{a} = \rho_{XY} \sqrt{\frac {DY}{DX} }, \hat{b} = EY - \hat{a}EX \]

  • \((X,Y)\)服从二维正态分布\(N(\mu_1,\mu_2,{\sigma_1}^2,{\sigma_2}^2,\rho)\),那么\(Cov(X,Y) = \sigma_1\sigma_2\rho\)

  • 条件期望的概念:书p69开始,注意:\(E(X|Y)\)是一个关于\(Y\)的函数,也是一个随机变量,而\(E(X|Y = y)\)是一个常数。全期望公式: \[ E(X) = E[E(X|Y)] \]

数理统计部分

  • 样本方差的定义:设有\(n\)个样本\(X_1,X_2,...,X_n\),则定义样本方差 \[ S^2 = \frac 1{n - 1} \sum_{i = 1}^n(X_i - \bar{X})^2 \] 其中: \[ \bar{X} = \frac 1n \sum_{i = 1}^nX_i \]

  • \(n\)个样本都是独立同分布的,且每个样本作为随机变量均值为\(\mu\),方差为\(\sigma^2\),则: \[ ES^2 =E(\frac 1{n - 1} \sum_{i = 1}^n(X_i - \bar{X})^2) = \frac 1{n - 1} \sum_{i = 1}^n(E^2(X_i) - 2E(X_i\bar{X}) + E^2(\bar{x}))\\ = \frac 1{n - 1} \sum_{i = 1}^n(\mu^2 + \sigma^2 + \mu^2 + \frac 1n \sigma^2 - 2E(X_i\bar{X}))\\ \because E(X_i\bar{X}) = \frac 1nE(\sum_{1 \le j\le n, j \neq i}X_iX_j) + \frac 1nE({X_i}^2) = \frac 1n\sum_{1 \le j\le n, j \neq i}E(X_i)E(X_j) + \frac 1nE({X_i}^2)\\ = \mu^2 + \frac 1n \sigma^2\\ \therefore ES^2 = \frac 1{n - 1} \times n \times \frac{n - 1}n\sigma^2 = \sigma^2 \]

  • \(\chi^2\) ~ \(\chi^2(n)\),则当\(n\)足够大时,\(\sqrt{2\chi^2}\)近似服从正态分布\(N(\sqrt{2n-1},1)\)

  • t分布:若\(X\) ~ \(N(0,1)\), \(Y\) ~ \(\chi^2(N)\),且\(X\)\(Y\)相互独立,那么定义\(T\) ~ \(t(n) = \frac{X}{\sqrt{Y/n}}\)t分布.

    \(n \to +\infty\)时,\(t(n)\) 可近似为 \(N(0,1)\)

  • F分布:\(X\) ~ \(\chi^2(n_1)\)\(Y\) ~ \(\chi^2(n_2)\),且\(X\)\(Y\)相互独立,那么定义\(F(n_1,n_2) = \frac{X/{n_1}}{Y/_{n_2}}\)F分布

    ​ 若\(F\) ~ \(F(n_1,n_2)\),则有$1F $ ~ \(F(n_2,n_1)\)。同时我们有:\(F_{\alpha}(n_2,n_1) = \frac 1{F_{1 - \alpha}(n_1,n_2)}\)

  • 假设\(X_1,X_2,...,X_n\)均服从\(N(\mu,\sigma^2)\)且相互独立,设\(\bar{X} = \frac 1n \sum_{i = 1}^n {X_i}\)\(S^2 = \frac 1n \sum_{i = 1}^n(X_i - \bar{X})^2\),那么:

    • $ _{i = 1}^n (X_i - )^2$ ~ \(\chi^2(n)\)
    • $_{i = 1}^n (X_i - {X})^2 $ ~ \(\chi^2 (n - 1)\) 这条等价为\((n - 1)\frac {S^2}{\sigma^2}\) ~ \(\chi^2(n - 1)\)
    • \(\frac{\bar{X} - \mu} {\sigma}\sqrt{n}\) ~ \(N(0,1)\)
    • \(\frac{\bar{X} - \mu}{S}\sqrt{n}\) ~ \(t(n - 1)\); (这里的S表示标准差,即\(\sqrt{S^2}\),同时我们可以比较上下发现,当\(n \to \infty\)时,\(t(n - 1) \to N(0,1)\)
  • p96 推论6.3 考前看一下,难记

  • 在正态分布中,不相关和独立是等价的,这是一条很强的结论

  • 正态分布随机抽样得到\(X_1,X_2,...,X_n\)。那么有\(S^2\)\(\overline{X}\)独立,且有\(\tilde{S}\)\(\overline{X}\)独立。

参数估计部分

\(\hspace{1cm}\) 这部分主要用于处理已经知道一系列样本,要用这些样板来估计分布函数里面的一些参数的问题。

  • 矩估计:

    • 若只有一个参数要估计,那只需要用一阶矩来估计即可; 如果有n个参数要估计,那么用前n阶矩来估计

    • 用第\(k\)阶矩来估计的方法:

      • 首先计算出总体的\(k\)阶原点矩,这等于\(E(X^k)\),这应该是一个关于待估计参数的函数
      • 然后计算出样本的\(k\)阶原点矩,这等于\(\frac 1n \sum_{i = 1}^n {X_i}^k\) ,这算出来是一个确定的数
      • 令样本的\(k\)阶原点矩等于总体的\(k\)阶原点矩,得到一个方程用来解待估计的参数
    • 更加一般化地说,就是我们可以用样本的\(k\)阶原点矩作为总体的\(k\)阶原点矩。例如,我们知道样本的\(1\)阶原点矩是\(a\),样本的\(2\)阶原点矩是\(b\),现在要求样本的标准差的矩估计。

      那么,我们就可以认为\(EX = a\), \(EX^2 = b\),于是\(D = EX^2 -(EX)^2 = b - a^2\),所以我们就认为\(\sqrt{b - a^2}\)是样本的标准差的矩估计。

    • 我们有如下记号约定:对于一个样本\((X_1,X_2,...,X_n)\),令 \(\tilde{S}^2 = \frac 1n \sum_{i = 1}^n (X_i - \overline{X})^2\)

    • 在矩估计法中,认为\(DX = \tilde{S}^2\)

      证明如下: \[ \tilde{S}^2 =\frac 1n \sum_{i = 1}^n (X_i - \overline{X})^2 \\ = \frac 1n \sum_{i = 1}^n{X_i}^2 - \frac 2n \sum_{i = 1}^n{X_i}\overline{X} + \overline{X}^2\\ = \frac 1n \sum_{i = 1}^n{X_i}^2 - 2\overline{X}^2 + \overline{X}^2 = \frac 1n \sum_{i = 1}^n{X_i}^2 - \overline{X}^2\\ = EX^2 - (EX)^2 = DX \]

  • 最大似然估计

    即为要让参数取该估计值时,样本发生的概率最大

  • \(\hat{\theta} = \hat{\theta}(X_1,X_2,...,X_n)\)是参数\(\theta\)的估计量,若对于任意的\(\theta \in \Theta\),都有: \[ E(\hat{\theta}) = \theta \] 那么称\(\hat{\theta}\)\(\theta\)的无偏估计。

    设总体方差\(DX=\sigma^2\)存在,那么\(S^2 = \frac 1{n - 1} \sum_{i = 1}^n (X_i - \overline{X})^2\)\(\sigma^2\)的无偏估计,而\(\tilde{S}^2 = \frac 1n \sum_{i = 1}^n(X_i - \overline{X})^2\)\(\sigma^2\)的有偏估计。

概率论好题整理

  1. \(X,Y\)独立,且\(X\) ~ \(P(\lambda_1)\),\(Y\) ~ \(P(\lambda_2)\)

    1. 证明\(X + Y\) ~ \(P(\lambda_1 + \lambda_2)\); (2)求在已知\(X + Y = m\)的条件下\(X\)的分布
  2. 已知随机变量\(X\)\(Y\)相互独立,且都服从\(N(0,1)\),求\(E[max\{X,Y\}]\)

  3. 已知\(X\) ~ \(N(\mu, \sigma ^2)\),求\(DX\)

    首先\(EX = \mu\), 然后: \[ DX = E((X - EX)^2) = E((X - \mu)^2) = \int_{-\infty}^{+\infty} \frac 1 {\sqrt{2\pi}\sigma}e^{-\frac {(x - \mu)^2}{2\sigma ^2}}\times (x - \mu)^2 dx\\ = \int_{-\infty}^{+\infty} \frac 1 {\sqrt{2\pi}\sigma} e^{\frac{-x^2}{2\sigma^2} }\times x^2 dx = \int_{-\infty}^{+\infty} \frac 1 {\sqrt{2\pi}\sigma} e^{-(\frac x{\sqrt 2 \sigma})^2} \times (\frac x{\sqrt 2 \sigma})^2d(\frac {x}{\sqrt 2 \sigma}) \times 2\sqrt 2 \sigma ^3\\ = \frac{2\sigma ^2}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-x^2} x^2dx \]

​ 考察积分\(\int_{-\infty}^{+\infty}e^{-x^2}x^2dx\)\[ \int_{-\infty}^{+\infty} e^{-x^2}dx = \left. xe^{-x^2}\right|_{-\infty}^{+\infty} + 2\int_{-\infty}^{+\infty}e^{x^2}x^2 dx = \sqrt{\pi} \] ​ 于是: \[ \int_{-\infty}^{+\infty}e^{x^2}x^2 dx =\frac {\sqrt \pi}2 \] ​ 所以: \[ DX = \sigma^2 \]

  1. 已知\(X\) ~ \(N(0,\sigma^2)\), \(Y\) ~ \(N(0,\sigma ^2)\),求证:\(X + Y\) ~ \(N(0, 2\sigma^2)\)\(X - Y\) ~ \(N(0, 2\sigma^2)\)

超级强的一个结论:\(X\) ~ \(N(\mu_1,{\sigma_1}^2)\)\(Y\) ~ \(N(\mu_2,{\sigma_2}^2)\),则\(X + Y\) ~ \(N(\mu_1+\mu_2,{\sigma_1}^2+{\sigma_2}^2 )\)

\(X - Y\) ~ \(N(\mu1 - \mu 2,{\sigma_1}^2+{\sigma_2}^2 )\)

  1. 设随机变量\(X\)\(Y\)相互独立,且都服从\(N(0, \frac 12)\)分布,求\(E|X - Y|\)以及\(D|X-Y|\) \[ E|X-Y| = \sqrt{\frac 2 \pi}\ \ \ \ \ \ \ D|X-Y| = 1 - \frac 2 \pi \]

  2. 设随机变量\(x\)具有概率密度: \[ f(x) = \begin{cases} \frac{x^m}{m!} e^{-x}\ \ \ \ \ \ \ \ x \ge 0\\ 0 \ \ \ \ \ \ \ \ \ \ \ \ \ else\end{cases} \] 其中\(m\)为正整数。请证明:\(P(0 \le X \le 2(m + 1)) \ge \frac m{m + 1}\)

    hint: 实际上这个概率密度函数和欧拉给出的 x! 的连续函数的被积函数很像

  3. 设随机变量\(X,Y\)独立,且分别服从参数为\(\lambda\)\(\mu\)的泊松分布,求\(E(X|X + Y = m)\)

    \(ans:\frac{m\lambda}{\lambda + \mu}\)

  4. 假设随机变量\(X_1\)\([0,1]\)上有均匀分布,\(X_i\)\([X_{i-1},X_{i-1}+1]\)上有均匀分布,其中\(i = 2,3,...,n\),求\(EX_n\)

    解:首先\(E(X_i|X_{i - 1}) = X_{i - 1} + \frac 12\),根据全期望公式,有

    \[ E(X_i) = E(E(X_i|X_{i - 1})) = E(x_{i - 1} + \frac 12) = EX_{i - 1} + \frac 12 \]\(E_1 = \frac 12\),所以\(E_n = \frac n2\)

  5. 已知\(X\) ~ \(N(0,1)\),求\(E(X^2),D(X^2)\) \[ E(X^2) = E^2(X) + D(X) = 1\\ E(X^4) = \int_{-\infty}^{+\infty}\frac 1{\sqrt{2\pi}}e^{-\frac{x^2}2}x^4dx\\ while\ \ \ \int_{-\infty}^{+\infty}e^{-x^2}x^4dx = \frac 34 \sqrt{\pi}\ \ (This \ \ is \ \ similar \ \ to \ \ calculating \ \ \int_{-\infty}^{+\infty}e^{-x^2}x^2 dx)\\ \therefore E(X^4) = 3\\ \therefore D(X^2) = E(X^4)-E^2(X^2) = 3 - 1 = 2 \]

  6. 设总体\(B\) ~ \(B(1,p)\),\(p\)为未知参数,\((X_1,X_2,X_3,...,X_n)\) 为来自总体\(X\)的样本,求\(p\)的极大似然估计

    第一种做法是\(L(\theta) = \prod(X_ip + (1-X_i)(1-p))\),这种后续取对然后求驻点是做不下去的

    需要写成\(L(\theta) = \prod(p^{X_i}(1 - p)^{1 - X_i})\),然后后续取对然后求驻点可以做

  7. 设总体\(X\)的密度函数为: \[ f(x) = \begin{cases}\frac 1 \theta e^{-\frac{x - \mu}\theta}\ \ \ \ x \ge \mu \\ 0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ x < \mu \\ \end{cases} \] 其中\(\theta > 0\), 求未知参数\(\mu, \theta\)的矩估计量

注意到:\(X - \mu\) ~ \(E(\frac 1\theta)\)

所以:\(E(X - \mu)\) = \(\theta\), \(D(X - \mu) = \theta^2\), 所以:\(EX = \theta + \mu, DX = \theta ^2\), 所以:\(\overline{X} = \theta + \mu, \tilde{S}^2 = \theta^2\) 所以:\(\theta = \tilde{S}, \mu = \overline{X} - \tilde{S}\)

  1. \(X\)\(Y\)不相关,是否有\(f(X)\)\(g(Y)\)不相关?
  2. \(X_1,X_2,...,X_n\)独立同分布,是否有\(\overline{X}\)\(S^2\)不相关?

ICS Notes

  • Carry Flag将两个操作数视作无符号数,无符号数加减溢出的时候Carry Flag会被置为1。在实际执行的加法中,如果两个操作数的符号位相等,但不等于结果的符号位,CF会被置为1,反之为0. 在减法中,如果不够减,那么CF被置为1.

CF (bit 0) Carry flag — Set if an arithmetic operation generates a carry or a borrow out of the most significant bit of the result; cleared otherwise.

  • Over Flag将两个操作数视为有符号数。加法时,若CF=1表示加法有进位。减法时,若CF=1表示不够减。

  • SHL 逻辑左移,每左移一次,把最高位移到CF里,并在低位补0
  • SHR 逻辑右移,每右移一次,把最低位移到CF里,并在高位补0
  • SAL 算术左移,和SHL一样,只不过若移位前后sign bit发生变化,那么OF会被置为1
  • SAR 算术右移,每右移一次,把最低位移到CF里,并在高位补符号位
  • ROL 循环左移,每左移一次,把最高位放到最低位,同时把最高位放到CF里
  • ROR 循环右移,每右移一次,把最低位放到最高位,同时把最低位放到CF里
  • RCL 把CF当成最高位加到原来的数左边,一起循环左移。
  • RCR 把CF当成最低位加到原来的数右边,一起循环右移。

  • JCXZ/JECXZ: jump if CX/ECX is 0

  • MUL

    • 单操作数 例如MUL A 意思是DX:AX = (AX * A)
    • 双操作数 例如MUL S, D 意思是 D *= S
    • 三操作数 例如MUL n,S,D 意思是 D = S * n, 这里D必须是寄存器
  • IMUL: 与MUL同理

  • 机器码里面的操作数如果是立即数存储方法也是小端序的,例如e8 6e ff ff ff实际上的操作数是ff ff ff 6e


编译器方面的优化方式:

  1. 提高cache的命中率,例如把二位数组求和改成行优先
  2. 提高流水线的命中率,例如①把循环拆开,②采用cmov指令
  3. 提高单条指令的运行速度,例如把lea 0(%eax,%eax,3) %eax 改为 SAL $2,%eax
  4. 使用串指令,例如在把一个数组的全部赋为0的任务中,使用memset这一封装了串指令的命令
  5. 使用多线程,例如在把一个数组的全部赋为0的任务中,开多个线程分别对数组的各个部分赋0
  6. 使用SIMD:一条指令成组操作,节约了时间
  7. 变量与寄存器绑定

C语言方面的优化方式:

  1. 把*2改为<< 1,提高单调指令的效率

  2. 循环展开,提高流水线效率

  3. 二位数组行优先访问,提高cache命中率

  4. 使用memset等封装了串指令的命令

  5. register int i

  6. if (A && B) 如果A满足概率90%, B 10%,那么写成if (B && A)

  7. 把递归改为迭代

  8. 用封装了SIMD的代码

  9. 多线程

  10. 减少重复计算


异常:CPU内部产生的错误; 中断:由外部设备产生的“外部事件”


Question:

  • rela.data节具体是怎么存储信息的?之前在实验四用readelf看的时候没有看到过rela.data节

    存储信息方式同rela.text;

  • 软中断(int命令)是异常还是中断?

    是异常

  • 栈帧栈顶地址是什么意思?是存返回地址的那个地方是栈顶吗?

    答:不是,是%esp的地址。栈帧的范围是从%ebp到%esp之间(闭区间)

  • 我想要让代码跳到0x4011d6处执行,对应的汇编代码不能直接写jmp 0x4011d6,而是得写成

    考试时可以这样写

    1
    2
    mov $0x4011d6,%r10
    jmp *%r10

    这是为什么?

    另外,如果我新建一个.s文件然后在里面写jmp 0x4011d6然后汇编,然后再反汇编,得到的反汇编结果是jmp 0x05,这是为什么?

  • 一般来说,用switch效率是比用很多if要高吗?

    不一定,按习题来

  • extern定义的变量名会不会出现在符号表里面(因为如果不用就不会)?这种变量是不是没有强弱符号的概念?

  • 在同一个.o文件下,要么全是R_386_32重定位方式,要么全部都是R_386_PC32重定位方式? 看起来不是:(

  • 栈顶是低地址

  • b8 mov literal, a1 mov 直接寻址

  • 外部变量不会在.data,.bss,.text节中

假设LP是一个标号,那么JMP LP等价的语句是:

LEA LP, %EAX

JMP *%EAX

字符之间的比较最好用ja,jb

.data 段里面定义的变量使用的时候可以替换为它的地址。

所有的符号(包括外部符号)只要被引用了就要重定位

如果\(x\)在C语言中是一个无符号数,那么x >>= 1翻译成汇编语言就会时shr

如果\(x\)在C语言中是一个有符号数,那么x >>= 1翻译成汇编语言就会时sar

基址加变址寻址:A(B,C,D)中B和C都必须要是寄存器

没直接问寻址方式就详细答,不放心全都写上

中断描述符表就是中断向量表。

欧拉函数&线性筛

1. 定义

\(\varphi(n)\) 表示所有与\(n\)互质的正整数\(i\)的个数,其中\(i <= n\).

2. 欧拉定理

欧拉定理是指对于任意互质的正整数\(a\)\(n\),有:

\[ a^{\varphi(n)} \equiv 1\pmod{n} \]

3. 证明

记小于\(n\)且与\(n\)互质的正整数集合\(R\)为:

\[ R = \{ r_1,r_2,r_3,……,r_{\varphi(n)}\} \]

再令集合S为:

\[ S = \{a\times r_1 \% n,a \times r_2 \% n,……,a\times r_{\varphi(n)}\%n\} \]

现在我们先证明\(S\)中的元素两两互不相同。

反证法:假设存在两个整数\(i\)\(j\)(方便起见,令\(i < j\))使得\(a \times r_i \% n = a \times r_j \% n\),则:

\[ a \times r_i \equiv a \times r_j \pmod{n} \]

又因为\(a\)\(n\)互质,所以\(r_i = r_j\),而\(i < j\),即 \(r_i \neq r_j\),矛盾,所以\(S\)中元素两两互不相同。

又因为\(a\)\(n\)互质,\(r_i\)与 n互质,所以易得\(a * r_i \% n\)\(n\) 互质,即\(S\)中的元素都与 \(n\) 互质。

所以可以得到 \(R = S\).

所以 :

\[ \prod_{i = 1}^{\varphi(n)}{r_i} = \prod_{i = 1}^{\varphi(n)}{a \times r_i\%n} \]

\[ \prod_{i = 1}^{\varphi(n)}{r_i} \equiv \prod_{i = 1}^{\varphi(n)}{r_i \times a} \pmod{n} \]

\[ \prod_{i = 1}^{\varphi(n)}{r_i} \equiv a^{\varphi(n)} \times \prod_{i = 1}^{\varphi(n)}{r_i} \pmod{n} \]

又因为$_{i = 1}^{(n)}{r_i} $ 与 n 互质,

所以: \[ a^{\varphi(n)} \equiv 1 \pmod{n} \]

4.线性筛

先上代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
#include<iostream>
#include<cstdio>
#include<cmath>
using namespace std;

const int maxn = 1e6 + 5;
int prime[maxn],px,l,r,n;
bool b[maxn];

int main() {
scanf("%d",&n);
for (int i=2; i<=n; i++) {
if (!b[i])
prime[++px] = i;
for (int j=1; j<=px && i * prime[j] <= n; j++) {
b[prime[j] * i] = 1;
if (i % prime[j] == 0)
break;
}
}
for (int i=1; i<=px; i++)
printf("%d\n",prime[i]);
return 0;
}

算法的正确性证明中,其实我们要证明的就是两点:

  • 每一个合数都会被筛掉。
  • 每一个合数只能被筛一遍(以保证时间复杂度\(O(n)\))

对于第一点,证明如下:

设这个合数为\(v\),则 \(v = p_1 p_2 …… p_k(p_i<=p_{i+1})\)\(p\)\(v\)的质因数。

可以发现,当\(i = p_2 \cdot p_3 \cdot …… \cdot p_k\), \(prime[j] = p_1\)时,\(i\)与所有小于\(p_1\)的质数互质,所以中途不会被break掉,所以此时\(v\)可以被筛掉。

对于第二点,证明如下:

在第一点中,我们是证明了v会在\(i = p_2 \times p_3 \times …… \times p_k\), \(prime[j] = p_1\)时被筛掉,那么现在就是要证明除了这种情况外,v不会被筛掉。

设当\(i = p_1 \times p_2 \times ... \times p_{m - 1} \times p_{m + 1} \times ... \times p_k\) \((p_i <= p_{i + 1})\)

\(prime[j] = p_m(p_m \neq p_1)\)时 v 被筛掉.

可以发现 \(p_1 | i\),所以当\(prime[j] = p_1\)时就会被break掉,所以v不会在此时被筛掉。

由此,我们可以证明,算法的正确性和时间复杂度的正确性。

5.求欧拉函数

首先,欧拉函数是一个积性函数。

积性函数的定义是:函数\(f(x)\),若满足\(f(m \times n) = f(m) \times f(n) (gcd(n,m) = 1)\),那么\(f(x)\)就是一个积性函数。

欧拉函数是一个积性函数。

设$n = p_1^{m_1} p_2^{m_2} p_3^{m_3} …… p_k^{m_k} \(,\)p_iP$,则有:

\[ \varphi(n) = \varphi(p_1^{m_1}) \times \varphi(p_2^{m_2}) \times …… \times \varphi(p_k^{m_k}) \]

对于\(\varphi(p^k),p_i \in P\),易得\(\varphi(p^k) = p^k - p^{k - 1} = p^k \times (1 - \frac{1}{p})\)

所以有:

\[ \varphi(n) = p_1^{m_1} \times p_2^{m_2} \times p_3^{m_3} \times …… \times p_k^{m_k} \times (1 - \frac{1}{p_1}) \times (1 - \frac{1}{p_2}) \times …… \times (1 - \frac{1}{p_k}) \\= n \times (1 - \frac{1}{p1}) \times (1 - \frac{1}{p2}) \times …… \times (1 - \frac{1}{p_k})) \]

特别的,当\(n \in P\)时,\(\varphi(n) = n - 1\).

理解了上面这些,可以上代码了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#include<iostream>
#include<cstdio>
using namespace std;

const int maxn = 1e6 + 5;
int n,phi[maxn],prime[maxn],px;
bool b[maxn];

int main() {
scanf("%d",&n);
for (int i=2; i<=n; i++) {
if (!b[i]) {
prime[++px] = i;
phi[i] = i - 1;
}
for (int j=1; j<=px && prime[j] * i <= n; j++) {
b[i * prime[j]] = 1;
if (i % prime[j] == 0) {
phi[i * prime[j]] = phi[i] * prime[j];
break;
}
phi[i * prime[j]] = phi[i] * phi[prime[j]];
}
}
phi[1] = 1;
for (int i=1; i<=n; i++)
printf("%d\n",phi[i]);
return 0;
}

可以看到,这个代码是由线性筛变过来的。

\(i \in P\)时,毫无疑问\(\varphi(i) = i - 1\).

\(i \notin P\)时,分以下两种情况讨论:

  • \(i \% prime[j] \neq 0\),因为\(prime[j] \in P\),所以\(gcd(i,prime[j])) = 1\)。因此根据积性函数的定义,有:

\[ \varphi(i \times prime[j]) = \varphi(i) \times \varphi(prime[j]) \]

  • \(i \% prime[j] = 0\),因为\(prime[j] \in P\),所以\(prime[j]\)必定为 i 的一个质因数,所以\(prime[j]\)必定是\(p_1,p_2,p_3,……,p_k\)中的一个,因此得到: \[ \varphi(i \times prime[j]) = i \times prime[j] \times (1 - \frac{1}{p1}) \times (1 - \frac{1}{p2}) \times …… \times (1 - \frac{1}{p_k})) = \varphi(i) \times prime[j] \]

最后和线性筛一样,可以证明欧拉函数求法时间复杂度也是线性的。

特别的,\(\varphi(1) = 1\).

6.一些神奇定理

  • \(\sum_{d|n}\varphi(d) = n\)

  • \(\forall n > 1,1-n\)中与n互质的数的和为\(n \times \varphi(n) / 2\)

  • \(若gcd(a,n) = 1\) $ 正整数b$ \(a^b \equiv a^{b \ mod \ \varphi(n)}(mod\ n)\)

  • \(a\),\(n\)互质,则满足\(a^x \equiv 1 (mod \ n)\)的最小正整数\(x_0\)\(\varphi(n)\)的约数。

gdb

  • $esp表示寄存器%esp

  • 可以在特定机器指令的地址上打断点,例如

    b *0xff3ef23f

    也可以这样:

    b *phase_1 + 22

  • 寄存器%pc%eip中存放的值貌似都是将要执行的下一条指令所在地址

    x/1i $pc可以显示将要执行的下一条指令

  • call命令会先把寄存器%ip压栈,然后把%ip赋值为调用的函数开头命令的地址

  • ret命令相当于就是pop %ip

A

​ Given two integer \(n\) and \(m\), among all the sequence containing n non-negative integers less than \(2^m\), you need to count the number of such sequences \(A\) that there exists a non-empty subsequence of \(A\) in which the bitwise AND of the integers is \(1\).

​ Since the answer may be very large, output it modulo a positive integer \(q\).

Input

​ The only line contains three integers \(n,m,q\). (\(1 \le n,m \le 5000, 1 \le q \le 10^9\))

Output

​ Output a line containing an integer, denoting the answer.

Sample input 1

1
2 3 998244353

Sample output 1

1
17

Sample input 2

1
5000 5000 998244353

Sample output 2

1
2274146

Solution:牛客2024D1TA_哔哩哔哩_bilibili

Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
#include<bits/stdc++.h>
#define f(i,l,r) for (int i=l; i<=r; i++)
#define int long long

using namespace std;

const int maxn = 5e3 + 5;
int n,m,mo;
int C[maxn][maxn];

void pre() {
f(i,1,5000)
C[i][i] = C[i][0] = 1;
f(i,2,5000)
f(j,1,i - 1) {
C[i][j] = (C[i - 1][j] + C[i - 1][j - 1]) % mo;
}

}

int ksm(int x,int y) {
int ans = 1;
while (y) {
if (y & 1)
(ans *= x) %= mo;
y >>= 1;
(x *= x) %= mo;
}
return ans;
}

signed main() {
cin>>n>>m>>mo;
pre();
if (m == 1) {
int ans = ksm(2,n) - 1;
if (ans < 0) ans += mo;
cout<<ans<<'\n';
return 0;
}
int ans = 0;

f(i,0,n) {
int tmp = ksm(ksm(2,i) - 1,m - 1);
if (tmp < 0) tmp += mo;
int oth = ksm(ksm(2,n - i),m - 1);
int cnt = ((tmp * oth) % mo * C[n][i]) % mo;
(ans += cnt) %= mo;
}
if (ans < 0) ans += mo;
cout<<ans<<'\n';
return 0;
}

B

​ Given two integer \(n\) and \(m\), among all the sequence containing n non-negative integers less than \(2^m\), you need to count the number of such sequences \(A\) that there exists at least two different non-empty subsequence of \(A\) in which the bitwise AND of the integers is \(1\).

​ Since the answer may be very large, output it modulo a positive integer \(q\).

Input

​ The only line contains three integers \(n,m,q\). (\(1 \le n,m \le 5000, 1 \le q \le 10^9\))

Output

​ Output a line containing an integer, denoting the answer.

Sample input 1

1
2 3 998244353

Sample output 1

1
7

Sample input 2

1
5000 5000 998244353

Sample output 2

1
530227736

Solution: 牛客2024D1TB_哔哩哔哩_bilibili

Code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
#include<bits/stdc++.h>
#define f(i,l,r) for (int i=l; i<=r; ++i)
#define int long long

using namespace std;

const int maxn = 5e3 + 5;
int n,m,mo,ans1,ans2;
int C[maxn][maxn],dp[maxn][maxn];
int f[maxn];

int ksm(int x,int y) {
int ans = 1;
while (y) {
if (y & 1)
(ans *= x) %= mo;
y >>= 1;
(x *= x) %= mo;
}
return ans;
}

void pre() {
f(i,1,5000)
C[i][i] = C[i][0] = 1;
f(i,2,5000)
f(j,1,i - 1) {
C[i][j] = (C[i - 1][j] + C[i - 1][j - 1]) % mo;
}
f[0] = 1;
f(i,1,5000)
f[i] = (f[i - 1] << 1) % mo;
//dp[i][j] = ksm(2^i - 1 - j, m - 1)
dp[0][0] = 0;
f(i,1,n) { // j in [0,i]
for (int j=1; j<=i; j+=2)
dp[i][j] = (dp[i - 1][((1 + j) >> 1) - 1] * f[m - 1]) % mo;
for (int j=0; j<=i; j+=2)
dp[i][j] = ksm(f[i] - 1 - j,m - 1);
}
}

void solve1() {
pre();
int ans = 0;

f(i,0,n) {
int v = f[i] - 1; if (v < 0) v += mo;
int tmp = (ksm(f[i], m - 1) - ksm(v,m - 1)) % mo;
if (tmp < 0) tmp += mo;
int oth = ksm(f[n - i],m - 1);
int cnt = ((tmp * oth) % mo * C[n][i]) % mo;
ans += cnt;
}
ans = (ksm(f[n],m) - ans) % mo;
if (ans < 0) ans += mo;
ans1 = ans;
}
signed main() {
// freopen("in.in","r",stdin);
// freopen("1.out","w",stdout);
ios::sync_with_stdio(0); cin.tie(0); cout.tie(0);
cin>>n>>m>>mo;
if (mo == 1) {
cout<<"0\n";
return 0;
}
solve1();
if (m == 1) {
int final = f[n] - n - 1; final %= mo; if (final < 0) final += mo;
cout<<final<<'\n';
return 0;
}

f(i,0,n) { // i: the number of how many entry with the least significant bit being 1
if (i > m - 1) break;
int sum = 0;
int p = f[i] - 1;
f(j,0,i) {
//dp[i][j] = ksm(2^i - 1 - j, m - 1)
int tmp = (dp[i][j] * C[i][j]) % mo;
if (j & 1) tmp = (~tmp + 1);
sum += tmp;
}
sum %= mo;
// if (sum < 0) sum += mo;
int oth = ksm(f[n - i],m - 1);
int loc = ((sum * oth) % mo * C[n][i]) % mo;
ans2 += loc;

}
int final = (ans1 - ans2) % mo; if (final < 0) final += mo;
cout<<final<<'\n';
return 0;
}

/*
int sum = 0;
f(j,0,i - 1) {
int tmp = ksm(n - j,m - 1) * C[i][j];
if (j & 1) tmp = -tmp; tmp %= mo;
(sum += tmp) %= mo;
}
*/

CSAPP notes

Lecture 2

  • Argument X X = 10100010
    << 3 X = 00010000
    logical >> 2 X = 00101000
    Arithmetic >> 2 X = 11101000

    Logical right shift would fill the high digit with zero.

    Arithmetic right shift would fill the high digit with the sign digit.

  • In real world practice, when you do X << Y in the machine, most machine would actually do X << (Y mod 8) (However, when I test it locally, this mod operation didn't apply)

  • The difference between && and &

    • && would only return 0 or 1, while & can return all kinds of integers
    • && would consider any non-zero integer as 1, and consider 0 as 0. & certainly don't have this trait
    • When the outcome of an expression is known, && would not calculate the rest of the expression. For example, if int x = 0, and we calculate expression a&&(5/a), it would not raise an DIVIDEZERO error as the second part would not be calculated. Similarly, if p is a null pointer, then p&&*p++ would not access a null pointer as the second part would be omitted.
  • If you shift left by a negative number, the outcome might not be a right shift. For example, 1 << (-2) might not be 1 >> 2

  • A different way to understand two's complement

    For example, the two's complement of a number is 10110001, then the value of this number should be: \[ 1\times (-2^7) + 0 \times 2^6 + 1 \times 2^5 + 1 \times 2^4 + 0 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 \]

    In other word, when it comes to two's complement, we consider the weight of the highest digit to be negative. More accurately, the weight of the highest digit should be the opposite number of its original weight.

  • If a comparison operation involves two signed numbers, then it would do it signed ways. If a comparison operation involves two unsigned umbers, then it would do it unsigned ways.

    However, if a comparison operation involves a signed number and an unsigned number, then it would first transform the signed number into unsigned form then compare these two unsigned numbers.

    For example, if int x = -1; unsigned int y = 0, then x would be greater than y. In other word, x > y would be true.

    This rule applies to other operations, including add, minus, etc.

  • Define int x = -2147483648 might make the compiler confused for obscure reason. You can do the same thing by int x = -2147483647 - 1

  • An example:

    1
    2
    3
    int i;
    for (i = 10; i - sizeof(char) >= 0; i--)
    printf("%d\",i);

    For this program, what you will get is just an endless loop. Why?because sizeof(char) is an unsigned number, making the whole bunch of i - sizeof(char) to be considered unsigned. Thus, i - sizeof(char)>=0 always holds, making the loop endless.

  • How to promote a smaller type of integer to a larger type of integer?

\(\hspace{1cm}\) For example, if we have a variable x of type int, how do we cast it to type long long?

\(\hspace{1cm}\) Simply by repeatedly copying the sign digit and insert it to the left until the number of digit reach the word size of long long.

\(\hspace{1cm}\) This is both true when it is a two's complement number and when it is an unsigned number.


Lecture 3

  • Addition for unsigned number (A.K.A. UAdd) :

    Computer will just add the two numbers up and abandon the higher digit if it exceeds the word size.

    For unsigned number u and v,perform addition between them would lead to the result of \((u + v) \% 2^w\) , with w representing the word size.

  • Addition for two's complement number (A.K.A TAdd):

    Computer will do the same thing in UAdd: Add the two numbers up and abandon the higher digit if it exceeds the word size.

    For example, if we have a 4-digit two's complement number u = 1101 and v = 0101. So u is actually -3 and v is actually 5. Thus, u + v should be 2 arithmetically. When we add the 1101 and 0101, we got 10010, and we abandon the most significant digit and we got 0010, which is 2 and it matches the arithmetic result.

    Another example: u = 1101, v = 1010, so u is actually -3 and v is actually -6, so u + v should be -9 arithmetically. If we add 1101 and 1010, we got 10111. Abandon the highest digit we got 0111, which is 7.

  • UAdd and TAdd have identical Bit-level behavior

  • Multiplication for unsigned number:

    Similar to addition, computer would first calculate the product, then abandon the higher digit exceeding the word size. Simply put, \(u \times v\) would be \((u \times v) \% (2^w)\), with \(w\) representing the word size

  • Multiplication for two's complement number:

    Computer would first calculate the product, then abandon the higher digit exceeding the word size.

    For example, if u=0101 and v = 0101, then \(u \times v = 11001\). Abandon the highest digit we got 1001 , that would be -7.

  • A property for two's complement

    Assume a number x can be represented \((\overline{a_1 a_2 a_3 ... a_n})_2\) in two's complement. Let's denote \((\overline{a_1 a_2 a_3 ... a_n})_2\) as \([a_1,a_2,...,a_n]\). And we could assert that: if \(x = [a_1,a_2,...,a_n]\), then: \[ -x = [1 - a_1,1 - a_2, ... , 1 - a_n] +1 \] Notice: the last "+" operation will be done in computer ways, which means if it overflows, then the highest digit would be omitted.

    For example, 0 = [0,0,0,0], so -0 = [1,1,1,1] +1 = [1,0,0,0,0], then we omit the highest number, then we could get [0,0,0,0], and the result is correct.

  • It appears that in my computer, when you do a ">>" operation then it would actually perform arithmetic >>

  • In a 64-bit word size machine, the address of the memory would be 8-bytes (which is also 64 bits). Thus, if we do printf("%d\n",sizeof(int*)), then it would probably get "8".

  • Terminology time:

    Word means a range of memory unit storing a specific type of data. For example, an int word is comprised of 2 bytes, an long long word is comprised of 4 bytes.

  • So, how are the bytes within a multi-byte word ordered in memory?

    Two ways: Big Endian; Little Endian;

​ Little Endian is the major way today.

​ The internet still use Big Endian.

  • The way to store strings:

    Computer would just store the string from head to tail char-wise. There's no big-endian or little-endian here.

​ (Notes: IA32 is a little-endian machine, and Sun is a big-endian machine)


Lecture 4

  • An example

    1
    printf("%d\n",(2/3)==(2/3.0));

    In fact, this program would output 0. Because (2/3) is an integer while (2 / 3.0) is a double number. Thus, the equation can not hold.

  • Implicit Conversion

    In the C language, when an operation involves multiple types of variables, type conversions occurs to make sure that the type of different variables match. In operations, implicit conversions take place. Type conversions can also be called as cast.

    • All integer type smaller than int (like short, char) will be promoted to int
    • Low precision type will be cast to high precision type.(int would be cast to float, float would be cast to double)
  • Immediate Value

    In the C language, immediate value refers to a constant value that is directly specified in code.

    The immediate value is also known as a literal.

    In the C language, the immediate value 3.14 are treated as type duoble by default.

  • IEEE floating point representation

    The IEEE float point representation represents a number in the form \(V = (-1)^s \times M \times 2^E\)

    The sign s determine whether this number is positive or negative

    The exponent E weights the value by a (possibly negative) power of 2.

    Floating-point number are represented in three fields above.

    • The single sign bit in the front directly encode sign s

    • The \(k-bit\) exponent field \(exp = e_{k-1}e_{k - 2}...e_0\) encodes the exponent \(E\)

    • The \(n - bit\) fraction field \(frac = f_{n - 1}f_{n - 2}...f_0\) encodes the significand \(M\)

    • Normalized Case

      When exp is not all 0 and exp is not all 1,it is the normalized case

      In this case, \(E = e - bias\),where \(bias = 2^{k - 1} - 1\) and \(e\) is the unsigned number having the bit representation of \(e_{k - 1}e_{k- 2}...e_0\).

      In this case, \(M = (1.f_{n - 1}f_{n - 2}...f_0)_2\)

    • Denormalized Case

      When exp is all 0, it is the denormalized case, used to represent floating-point number that is close to zero.

      In this case, \(E = 1 - bias\), where \(bias = 2^{k - 1} - 1\).

      In this case, \(M = (0.f_{n - 1}f_{n - 2}...f_0)_2\)

    • Special Case A

      When exp is all 1, and frac is all 0, this number means infinity. By assigning s to 1 or 0, we could get positive infinity and negative infinity.

    • Special Case B

      When exp is all 1, and frac is not all 0, this number means NaN (Not a number). Such values are returned as the result of an operation where the result can not be given as a real number or as infinity, as when computing \(\sqrt{-1}\) or \(\infty - \infty\) .

  • Why do we have to define denormalized case?

    In normalized case, we can not represent 0.

    In denormalized case, we can represent 0 as well as some numbers really close to 0.

  • Floating-point Operation

    The IEEE standard specifies a simple of way of floating-point operation. Viewing the two number as real numbers and just do the operation in mathematical ways, then round the mathematical result to yield the final result.

    If one of the value is special (such as NaN, infinity,0), the IEEE standard specify a convention that attempt to be reasonable. To be specific, 1/-0 is defined to yield \(-\infty\), \(\infty - \infty\) is defined to yield NaN.

  • The floating-point addition is commutative but not associative. For example, a float x = 3.14 + 1e20 - 1e20 would yield 0, while a float x = 3.14 + (1e20 - 1e20) would yield 3.14

  • The floating-point multiplication is commutative but not associative. For example, 1e20 * 1e20 * 1e-20 would yield \(+\infty\), while 1e20 * (1e20 * 1e-20) would yield 1e20

    The floating-point multiplication does not distribute over addition.

  • Conversion between int, float and double:

    • int -> float: The number will not overflow, but it may be rounded.
    • int or float -> double: The number will not overflow and it will not be rounded. Because it has both greater range and greater precision.
    • double -> float: The number might overflow(leading to \(+\infty\) or \(-\infty\)) or be rounded.
    • float or double -> int: The number would be rounded toward \(0\).
  • Infinity can be compared. It is compared in mathematical ways.

  • For floating-point number \(a\),\(b\),\(c\)

    • if $ a> b$, then \(a + c > b + c\)
    • If \(a > b\), then \(-a < -b\)
    • For every floating-point number \(x\), we have \(x \times x \ge 0\)

Data lab


Lecture 5

  • how to assemble a file

    1
    gcc -S test.c

    Can assemble test.c into assemble code test.s

  • how to disassemble an object code file

    1
    objdump -d test > test.d

    Can disassemble the object code file test into assemble code file test.d

  • Use gdb to disassemble a function:

    Suppose we have a program code test.c and we want to disassemble the function build which is in this program.

    First we have to compile test.c:

    1
    gcc -Og -o test test.c

    Then we type

    1
    gdb test

    Then we would get into the gdb interface.

    After it, we type

    1
    disassemble build

    and we could get the disassemble code

  • The concept of instruction set:

    There are two kinds of prevalent instruction set, intel x86-64 and ARM. Most intel chip use x86-64 while mobile phone chip and apple chip use ARM. The course is about x86-64

  • The assembly code usually implement data manipulation in memory and register. There are 16 register, and there names are as follow:

    %rax %r8
    %rbx %r9
    %rcx %r10
    %rdx %r11
    %rsi %r12
    %rdi %r13
    %rsp %r14
    %rbp %r15
  • the syntax %rax refers to the register %rax, and the syntax (%rax) refers to the value in register %rax

  • A general way to designate address: \[ D(R_b, R_i, S) \] \(D\) is a constant, \(R_b\) and \(R_i\) are the name of two general-purpose register, and the \(S\) is the scaling factor restricted to be 1, 2, 4, or 8. This expression specifies an address, which would be: \[ D + ([numbers \ \ in \ \ R_b] + S \times [numbers \ \ in \ \ R_i]) \] For example, provided that the number stored in \(R_b\) is 0x100, the number stored in \(R_i\) is 0x200, \(D = 2\), \(S = 4\), then \(D(R_b,R_i,S)\) would designate the address 2 + 0x100 + 0x200 * 4, which is 0x902.

    (Denote: D is short for displacement, and S is short for scaling)

  • The MOV grammar:

    • Move an immediate value to the register:

      mov $0x4, %rax

      This would move the immediate value 0x4 into the register %rax

    • Move an immediate value to the memory:

      mov $-147, (%rax)

      In this case, there is an address stored in the register %rax and it would move the immediate value -147 to that address in the memory

    • Move a value from a register to register

      mov %rad, %rdx

      It would copy the value stored in register %rad to register %rdx

    • Move a value from a register to memory

      mov %rax, (%rdx)

      It would copy the value stored in register %rax to the address (%rdx) in the memory

    • Move a value from memory to register

      mov (%rax), %rdx

      vice versa

    Note: you can't move a value from memory directly to memory

    Observation:

    ​ This operation simply move something into a register or the memory. If it is a register, the second argument should be the name of the register. If it is the memory, then the second argument should be the address, usually designated by (the name of a register) or D(Rb,Ri,S).

    ​ The first argument specifies the number we want to move. The first argument could be an immediate number, the name of a register or an address. If it is an immediate value, then it means it would move this literal. If it is the name of a register, then it means it would move the number stored in the register. If it is an address, it means it would move the number stored in this address.

  • The LEA grammar

    The general form:LEA first_argument, second_argument

    The first argument is an address, the second argument must be the name of a register.

    This operation will move the address (the first argument) itself to the second argument (a register or an address).

    • The difference between LEA and MOV: LEA would simply move the address itself while MOV would dereference the address and move the value stored in that address

​ In practice, the first argument can be literal or the name of a register. This might violate the rule, but compiler still use this, especially in some arithmetic operation.

Can't understand MOV and LEA? Watch this video: 4 MOV vs LEA (youtube.com)

  • In a function, the first parameter passed to the function will be stored in register %rdi, the second parameter would be stored in %rsi. The return value of a function would be stored in register %rax

  • movl instruction:

    This instruction could move a 32-bit value from one place to another.

    • From register to register:

      movl %eax, %ebx

      This would move the value in %eax to %ebx

      (Note: %eax is the lower 32 bits of %rax, so does %ebx)

    • From memory to register

      movl 0x100(%rbp), %eax

      This would move the 32-bit value from memory location at 0x100(%rbp) to register %eax. This would load 4 memory units and would load it upside down (because we are talking about little-endian machine)

      For example, here are the number stored in memory:

      Memory Location Value
      0x100 0x17
      0x101 0x25
      0x102 0x33
      0x103 0x64
      0x104 0x89
      0x105 0x26
      0x106 0x71

      If %rbp = 0, then after this operation, 0x64332517 would be stored in register %eax.

    • From register to memory

      Similar, so omit it.

    • Move a literal to memory & Move a literal to register

      Similar, so omit it.

    Note: After the movl operation, the upper-32 bits of the destination(could be register or memory location) would be set to 0

  • A general property for instructions copying and generating 1-, 2-, 4-, 8- byte values: When the destination is a register, those that generates 1-,2- byte values leaves the remaining upper bytes unchanged, while those that generate 4- byte values will set the upper 4 byte to 0.

  • The program counter(commonly called %rip) indicates the address in memory of the next instruction to be executed.

  • The regular movq operation can only have immediate source operand that can be represented in 32-bit two's complement (Simply put, the immediate source operand must be in the range of \([-2147483648, 2147483647])\). Then the compiler would do sign extension to the 32-bit two's complement to make it 64-bit, then carry out the move.

    What is sign extension?

    ​ Given a number represented in short-word-length two's complement, we want to represent this number in a long-word-length two's complement without changing its arithmetic value. This transformation is called sign extension.

    ​ We can finish this task by replicating the sign digit. Whether the number is positive or not, you will find by doing the replication the value remain unchanged.

    ​ For example, 0x0011 after sign extension becomes 0x000011, the value remains unchanged. (before: 3; after: 3)

    ​ Another example: 0x1100 after sign extension becomes 0x111100, and the value remains unchanged (before: -4; after: -32+16+8+4 = -4;)

  • As mentioned above, the regular movq instruction has limitation on literal, that's why the instructionmovabsq comes into life. It can move arbitrary 64-bit two's complement literal, and the destination must be a register.

  • When copying from a smaller source to a larger destination (for example, from a 32-bit to 64-bit), the x86-64 ISA provides two instructions: movz and movs.

    movz would set the remaining higher bits to 0, while movs would fill the remaining higher bits by replicating the most significant bits of the source operand.

    Here are some example:

​ The operation cltq has the same effect of the operation movslq, but it has more compact encoding.

cltq is short for `convert l(double word) to q(quad word)

  • The unary operation:

    It has a single operand serving as both resource and destination. For example, we could have:

1
2
incq(%rax) # to increment register %rax by 1
decq(%rax) # to decrement register %rax by
  • The arithmetic binary operation:
1
2
3
4
addq %rcx, %rax   # to increment register %rax by the value stored in %rcx
subq %rcx, %rax # to decrement register %rax by the value stored in %rax
mulq %rcx, %rax # rax <---- (rax * rcx) (value stored in rax and rcx are considered to be unsigned)
imulq $16, (%rax,%rdx,8) # to multiply the address %rax + 8 * %rdx by a factor of 16 (value involved are considered to be two's complement)

For these binary operations, there are two observations:

  • The source would be the first argument, and the destination would be the second argument.
  • The first operand could be an immediate number, a register or a memory location. The second operand could be an register or a memory location. However, the first operand and the second operand could not both be memory location.
  • Shift operation:

    The operation involves two operand, the first operand specifies the shift amount, and the second value designates the value we want to shift. Shift amount usually is a immediate number, but it can also be the value stored in single-byte register %cl.

    A shift operation on the data type that are w bits long would use the lower m bits of the register %cl, where \(2^m = w\). For example, if the register %cl stores the hexadecimal value of $0XFF, then SALB would shift by 7 digits, SALW would shift by 15 digits, SALL would shift by 31 digits, and SALQ would shift by 63 digits.

    • Left shift:

      Comprises two instructions: SAL and SHL. These instructions have same effect

    • Right shift:

      Comprises two instructions: SAR and SHR. The former is the arithmetic right shift, and the latter is the logical right shift.

  • Single Operand Multiplication:

    ​ This types comprise two operations: imulq S and mulq S. The former one is for two's complement, and the latter one is for unsigned. Apart from this difference, the rest are the same, so we would only talk about imulq S.

    ​ The operation imulq S would calculate the value \(S \times M[\%rax]\) keep the all 128-bit without truncate. Then it would store the higher 64 bits to register %rdx and store the lower 64 bits to register %rax.

  • cqto operation:

    This would convert a 64-bit value to 128-bit value. Specifically, it would sign extend %rax to make it to be a 128-bit value, and then store the upper 64 bit in %rdx and store the lower-64 bit in %rax

  • divide operation:

    There are divide operation: idivq and divq. Similarly, former one is for two's complement, and the latter one is for unsigned. Let's just talk about idivq since divq is the same case.

    Syntax for idivq: idivq S

    • case 1: If the bits of $rdx are all set to the sign bit of $rax (for operation divq S, we should look at whether the bits of $rdx are all set to zero), then we would consider that the dividend is a 64-bit value stored in $rax.
    • case 2: If the bits of $rdx are not all set to the sign bit of $rax (for operation divq S, we should look at whether the bits of $rdx are all set to zero), then we would consider that the dividend is a 128-bit value, with higher 64 bits stored in $rdx and lower 64 bits stored in $rax

    Anyway, after specifying the dividend, then the machine would do the division, and yield a quotient and a remainder.

    The quotient would be stored in $rax and the remainder would be stored in $rdx.

CSAPP lecture 6

1. Condition code

  • Four condition codes:

    • CF: Carry Flag. Will be 1 if there is a carry out of the most significant bit. Used to detect the overflow of the unsigned number
    • ZF: Zero Flag. Will be 1 if the most recent operation yield 0.
    • SF: Sign Flag. Will be 1 if the most recent operation yield a value with the sign bit being 1.
    • OF: Overflow Flag. Will be 1 if the most recent operation caused an overflow (for two's complement), either positive overflow or negative overflow.

    The leaq operation would not alter any of the condition code. All other arithmetic operation would make these condition code to change.

  • For the logical operation (such as xor or add), they will leave the CF and OF to be 0.

  • For the shift operation, the CF will be set to the last digit shifted out, and the OF will be set to 0.

  • For the inc and dec operation, they will set the OF and ZF, but they will leave the CF unchanged.

  • There are two instruction classes which can only alter the condition codes without changing the other value.

​ We can check the sign of a value or whether this value is 0 by test %rax %rax. (Suppose this value is stored in register %rax)

2. Access the condition code

  • Access the condition code: Rather than directly access the condition code, we prefer to access the condition code indirectly, using the following 3 methods:
    • We can set a single byte to 0 or 1, depending on some combination of the condition codes.
    • We can conditionally jump to some part of the program
    • We can conditionally transfer data.
2.1 How to set a single byte according to the condition code

​ These set class of instruction have a single operand D indicating the destination. The destination have to be a register or a memory location. If it is a register, then it would set the least significant byte of the register to 1 (usually this register have to be %al kind rather than %rax kind.)

​ an example:

1
2
3
4
5
comp:
cmpq %rsi %rdi
setl %al
movzbq %al %rax
ret
2.2 (Conditionally) jump to some part of the program

Unconditionally jump:

1
2
3
4
5
	movq $0, %rax
jmp .L1
movq (%rax), %rdx
.L1:
popq %rdx

The jmp instruction would jump to .L1 label

jmp is the unconditional jump operation. It can jump directly or indirectly. When it is directly, it's something like jmp .L1. When it is indirectly, it is something like jmp *%rax or jmp *(%rax), where the label is stored in the register %rax or the memory location (%rax)

Conditional jump:

2.3 Jump instruction encodings

​ In assembly code, to specify where we want to jump to we can simply write something like .L1 happily. In the disassembled machinery code (the assemble code translated from the machinery code), the jump operation would written in the form like jmp 0x1137. There will be an address operand like 0x1137 to specify which operation will the program jump to. In this case, the jump operation would jump to 0x1137 and do the operation in 0x1137

​ But in machinery code, how to specify where we want to jump to? There are two ways to do this, first is so called PC relatives. That is saying, they encode the target address as the difference between the target address and the address of the instruction immediately following the jump instruction.

The difference between a and b is a - b

​ The reason why we try to use this kind of notation is that the jump target would remain unchanged when the whole program is shifted to another address.

1
2
eb 03 # In this case, 0x03 is the encoded jump target. 0x03 has to be 							added by the address of the next instruction to get the 								target address. 
# "eb 03" is a machine code in binary. "eb" specifies the type of the operation and "03" specifies the encoded jump target

Note: this 0x03 is a two's complement number, so for example, if now it is 0xFF, then it is actually -1, and the actual jump target would be the address of the next instruction minus 1.

Question

​ Sometimes we would see operation jmp .L1 and sometimes we would see operation jmp 0x1137. Which one (probably both) is the assembly code? What's difference between these two command?

2.4 Conditional Move

​ Apart from the jump operation, we have an operation called conditional move.

1
2
3
4
5
6
7
long comp(long x,long y) {
long res;
if (x < y)
res = y - x; else
res = x - y;
return res;
}

For example, this C code might be translated into the following assembly code:

1
2
3
4
5
6
7
8
absdiff:
movq %rsi, %rax
subq %rdi, %rax
movq %rdi, %rdx
subq %rsi, %rdx
cmpq %rsi, %rdi
cmovge %rdx, %rax #If the last comparison said "%rdi >= %rsi", then this move will conditional move will apply. Otherwise nothing will happen.
ret

Question

​ Why will the conditional move operation reduce the penalty for misprediction? It also need to decide whether to do the move instruction, and the misprediction should affect the following pipeline as well.

The class of conditional move:

​ Some property for the conditional move:

  • The destination must be register
  • The data type can be 2,4,8 bytes long, but can't be 1 byte long
  • The instruction don't need a suffix to specify the data type. The processor could infer the operand length from the name of the destination register.

The condition move may not be valid if one expression is invalid.

for example

1
2
3
4
int *p;
if (!p)
return 0; else
reutnr *p;

You can not calculate *p beforehand as p may be a null pointer, making the dereferencing invalid.

\[ \int\sqrt{a^2 - x^2}dx = \frac x2\sqrt{a^2 - x^2} + \frac {a^2}2\arcsin(\frac xa) + C \] \[ \int \frac 1{\sqrt{x^2 \pm a^2}}dx = \ln|x + \sqrt{x^2 \pm a^2}| + C \] \[ \int \sqrt{x^2 \pm a^2}dx = \frac x2 \sqrt{x^2 \pm a^2} \pm\frac {a^2}2\ln|x + \sqrt{x^2 \pm a^2}| + C \]

1. Git本地基本操作

0. 一些英语翻译
  • stage : 暂存区
  • index: 暂存区
  • working tree: 当前HEAD指向的版本
  • working directory: 工作区
1.创建本地仓库

​ 随便新建一个文件夹,然后执行

1
$ git init

​ 于是这个文件夹就变成了一个本地仓库(会出现一个叫做.git的隐藏文件夹)

2. 一些术语
  • 工作区:就是你在电脑中能看到的目录(即那个文件夹)
  • 暂存区:简单理解的话,就是文件被add之后就会被复制一份并且存储在暂存区
  • github本地仓库:简单理解的话,就是commit之后文件去的地方
  • 版本库:包含了暂存区和本地仓库
3. 四个状态,一些操作
  • 四个状态:untrackedunmodifiedmodifiedstaged

  • 在工作区中新建一个文件之后,输入git status,会发现这个文件是untracked状态
  • 当我们add了这个文件之后,这个文件的状态变为staged,进入了暂存区
  • 然后我们commit这个文件,完成commit之后这个文件库中的文件和本地文件又变为一致,此时这个文件状态变为unmodified

  • 当我们在add完文件之后再对这个文件进行修改,此时这个文件会变成modified状态,并且使用git status命令时会提示我们这个文件被修改了。此时我们可以使用git checkout -- 文件名操作将这个文件回溯为暂存区中的版本。
  • 当我们add并且commit完文件之后再对这个文件进行修改,那么这个文件会变成modified状态。此时我们依旧可以使用git checkout -- 文件名操作将这个文件回溯为本地库中的版本。
  • 如果本地库中的文件版本与工作区的不同,即使我们add了工作区中的文件,这个文件的状态依旧是modified。此时如果我们对这个文件进行checkout,这个文件的内容不会有任何改变。
  • 如果工作区的文件,暂存区的文件和本地库中的文件版本各不相同,那么git checkout -- 文件名会把工作区的文件回溯为暂存区中的版本。

一言以蔽之,在这个文件addcommit之后再做修改,这个文件的状态就会变成modified。然后如果对这个文件执行git checkout --,就会把这个文件回溯到最近一次git commitgit add时候的状态。

4. git log命令

​ 输入git log之后进入一个页面,可以按回车不停往下翻;可以按下键盘上的Q键退出日志;

git log中最重要的部分在于它会显示每个提交的hash值。

5. 如何查看Git中的文件内容和文件目录?
  • 如何查看工作区的文件内容&文件目录? 用文件资源管理器 或 分别用lscat命令

  • 如何查看本地仓库中的文件目录?

    1
    $ git ls-tree HEAD
  • 如何查看本地仓库中的某个文件内容?

    1
    $ git show <commit>:<file>   # <commit>是提交的hash值或分支名,<file>是文件名

    1
    $ git show HEAD:<file>      # 直接查看当前分支中某个文件的内容
  • 如何查看暂存区中的文件目录?

    1
    $ git ls-files
  • 如何查看暂存区中的文件内容?

    1
    $ git show :<file>
6. Git的撤销与回退

git reset实现版本回退:

  • 版本回退可以用git reset命令。
1
2
3
4
$ git reset HEAD^    		# 撤销到上一个版本
$ git reset HEAD^^ # 撤销到上两个版本
$ git reset HEAD~100 # 撤销到上100个版本
$ git reset <commit hash> # 撤销至某一个提交(commit hash是git log中的那串16进制数字,不需要全部输完,只需要输前几个数字就可以了)
  • git reset可以带一些参数,一般有三种版本:

    • mixed,即默认版本,最常用,执行之后移动HEAD指针,同时回溯暂存区,但不动工作区
    • soft,使用方法举例:git reset --soft HEAD^,执行之后移动HEAD指针,不动暂存区和工作区。
    • hard,使用方法举例:git reset --hard HEAD^,执行之后即移动HEAD指针,又动暂存区,还动工作区
  • git reset之后发现reset错了,想回到reset之前的较新的版本,怎么办?

    没事,只要找到较新的那个版本的commit hash,然后还是用git reset <commit hash>就可以了(这里的<commit hash>是那个较新版本的commit hash

  • git reset之后发现reset错了,而且还手滑关掉了命令行窗口,重新打开之后输入git log发现找不到那个较新版本的commit hash,怎么办?

    没事,输入git reflog就可以看到之前提交的commit hash了。


git reset也可以实现把已经添加到暂存区的对于某个文件的修改撤销掉:

1
$ git reset HEAD <file_name>
7. 文件的删除与恢复
  • 删除工作区的文件:rm <filename>

  • 删除完工作区的文件之后如果这个文件之前被add或者commit过,那么可以

    git checkout -- <filename>来恢复。


  • 先执行rm <filename>,再执行git rm <filename>,那么执行完之后这个文件在工作区和暂存区中都被删掉了

  • 这个时候可以:先git reset HEAD <filename>将文件从本地仓库恢复到暂存区,然后再

    git checkout -- <filename>将文件从暂存区恢复到工作区


  • 如果直接执行git rm <filename>,然后git commit -m "message",那么文件会在工作区中保留,然后再暂存区和本地仓库中都消失。

2. Git远程库

1.建立SSH连接

​ 这一步是要让本地PC与远程github建立SSH连接,具体方法为:

  • 打开powershell管理员模式,输入

    1
    $ ssh-keygen -t rsa -C "youremail@example.com"

    注意:这里的email地址就写注册github时候用的就可以

  • 在C盘用户主目录下找到.ssh文件夹,进入,打开id_rsa.pub,复制其中的内容

  • 登录github打开“Account settings”,“SSH Keys”页面:

然后,点“Add SSH Key”,填上任意Title,在Key文本框里粘贴id_rsa.pub文件的内容:

  • 点“Add Key”,你就应该看到已经添加的Key:
2. 连接远程仓库

在本地仓库中git bash here,然后在git中输入:

1
$ git remote add origin <ssh>

这里的<ssh>是github上的仓库中的ssh端口地址

然后这里的origin是我们给远程库设置的名字

3. 将本地库的内容推送到远程仓库
  • 第一次推送的时候用这个命令:
1
$ git push -u origin master

​ 注:git push命令实际上是把当前分支master推送到远程

​ 由于远程库是空的,我们第一次推送master分支时,加上了-u参数,Git不但会把本地的master分支内容推送的远程新的master分支,还会把本地的master分支和远程的master分支关联起来,在以后的推送或者拉取时就可以简化命令。

  • 之后推送的时候就直接:
1
$ git push origin master

​ 可以把本地master分支的最新修改推送至GitHub

4. 断开本地库和远程库之间的连接
1
$ git remote rm origin

用这个命令可以断开本地库和远程库之间的连接

3.在Vscode中使用Git

1.可能会遇到的第一个问题

​ 当你打开vscode的source control模块时,vscode可能会认为你还没有安装git,并建议你安装git。可是你明明已经安装过了git,于是你觉得很奇怪。这里的原因在于你还没有更改vscode的路径。具体方法如下:

  • 打开vscode,按下ctrl + ,,在settings中打开搜索栏,搜索git.path
  • 点击Edit in settings.json文件,进入这个json文件
  • 复制你的git.exe的路径,它应该长成D:\application\Git\bin\git.exe这样。把这个路径复制到那个json文件"git.path"后面
  • 把你复制的路径中的\全部改成\\,不然会报错
  • 最终应该长成这样:"git.path": "D:\\application\\Git\\bin\\git.exe"(你的路径肯定跟我不一样)
  • 重启vscode
0%