Program architecture design approach: design notes for a C language course project

1. How keyword-based duplicate checking works

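I decided to build a duplicate-checking system for C++ programs. The basic principle is to judge similarity by how keywords are used in each program. (Every program uses keywords a little differently. Then again, thinking it over more carefully: two programs with the same functionality should probably have fairly similar keyword usage?)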

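Either way, we need to count how many times each keyword appears in a program (this is the key point), and then compute the similarity of the two programs from the frequencies of their keywords.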

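Count the frequency of each keyword in each source program to obtain two vectors, X1 and X2, then judge the similarity of the two source programs by computing the relative distance between X1 and X2.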
For example:
Keyword              void int for char if else while do break class
Program 1 frequency     4   3   0    4  3    0     7  0     0     2
Program 2 frequency     4   2   0    5  4    0     5  2     0     1
X1 = [4, 3, 0, 4, 3, 0, 7, 0, 0, 2]
X2 = [4, 2, 0, 5, 4, 0, 5, 2, 0, 1]
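Let s be the relative distance between the vectors X1 and X2.

Formula 1

When X1 = X2, s = 0, which suggests the two may be the same program; the larger s is, the more the two programs are likely to differ.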
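The formula itself is not reproduced in the text above, so the following is only a minimal sketch under an assumption: it uses the plain Euclidean distance, s = sqrt(sum over i of (X1[i] - X2[i])^2), which does satisfy s = 0 when X1 = X2 but may not be the exact formula intended. The function name keywordDistance is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Distance between two keyword-frequency vectors.
// ASSUMPTION: Euclidean distance is used as a stand-in for "Formula 1",
// which is not shown in the text.
double keywordDistance(const std::vector<int>& x1, const std::vector<int>& x2) {
    if (x1.size() != x2.size()) {
        throw std::invalid_argument("frequency vectors must have the same length");
    }
    double sumOfSquares = 0.0;
    for (std::size_t i = 0; i < x1.size(); ++i) {
        const double diff = static_cast<double>(x1[i]) - static_cast<double>(x2[i]);
        sumOfSquares += diff * diff;  // accumulate squared per-keyword difference
    }
    return std::sqrt(sumOfSquares);
}
```

For the example vectors X1 and X2, the squared differences add up to 12, so under this assumed metric s = sqrt(12), roughly 3.46.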

2. Program feature design

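The program should be divided into 4 parts:

1. Read the two program files to be compared (a basic feature; a large duplicate-checking system works on the same principle, i.e. one file under check and one comparison database).

2. Store the frequency of each keyword in each file.

3. Compute the program similarity coefficient s from the stored frequencies.

4. Output the results, including a comparison table of the two programs' keyword frequencies and the program similarity coefficient.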

3. Materials to prepare

1. C++ keyword list:

alignas (since C++11)
alignof (since C++11)
and
and_eq
asm
atomic_cancel (TM TS)
atomic_commit (TM TS)
atomic_noexcept (TM TS)
auto(1)
bitand
bitor
bool
break
case
catch
char
char8_t (since C++20)
char16_t (since C++11)
char32_t (since C++11)
class(1)
compl
concept (since C++20)
const
consteval (since C++20)
constexpr (since C++11)
const_cast
continue
co_await (since C++20)
co_return (since C++20)
co_yield (since C++20)
decltype (since C++11)
default(1)
delete(1)
do
double
dynamic_cast
else
enum
explicit
export(1)(3)
extern(1)
false
float
for
friend
goto
if
inline(1)
int
long
mutable(1)
namespace
new
noexcept (since C++11)
not
not_eq
nullptr (since C++11)
operator
or
or_eq
private
protected
public
reflexpr (reflection TS)
register(2)
reinterpret_cast
requires (since C++20)
return
short
signed
sizeof(1)
static
static_assert (since C++11)
static_cast
struct(1)
switch
synchronized (TM TS)
template
this
thread_local (since C++11)
throw
true
try
typedef
typeid
typename
union
unsigned
using(1)
virtual
void
volatile
wchar_t
while
xor
xor_eq

Excerpted from: https://zh.cppreference.com/w/cpp/keyword (if this infringes any rights, contact me and I will remove it)

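2. References on algorithms for storing keywords in a hash table:

1. Counting keyword frequencies with a hash table

2. Counting the frequency of each word in a file (hash table implementation)

3. Document duplicate checking with a dynamic programming algorithm (C/C++ implementation)

4. C++ review of the word-frequency counting problem -- vector, map, hash_map (timing comparison of the three approaches)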
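As a rough idea of what the hash-table approach in these references looks like, here is a minimal sketch that uses the standard std::unordered_map (the standard-library successor to the older hash_map extension). It is not taken from any of the referenced articles; the function name wordFrequencies is hypothetical, and words are assumed to be runs of letters, digits and underscores.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>

// Build a word -> occurrence-count table for every identifier-like word in
// `text`. Keyword frequencies can then be read off by looking up each keyword.
std::unordered_map<std::string, int> wordFrequencies(const std::string& text) {
    std::unordered_map<std::string, int> freq;
    std::string word;
    for (char c : text) {
        if (std::isalnum(static_cast<unsigned char>(c)) || c == '_') {
            word += c;
        } else if (!word.empty()) {
            ++freq[word];  // operator[] inserts a zero count on first sight
            word.clear();
        }
    }
    if (!word.empty()) {
        ++freq[word];      // last word, if the text does not end with a separator
    }
    return freq;
}
```

Looking up a keyword is an average constant-time operation, so filling a frequency vector for the full keyword list costs one lookup per keyword after a single pass over the source text. Use find or count for the lookup rather than operator[], which would insert missing keywords into the table.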

To be continued
