Exploring Filecoin: Proof-of-Replication (Part 1)

Tutorial

Preface

When running Filecoin we inevitably run into all sorts of operational issues, and the best way to prepare for them is to understand how the code is implemented: once we know how things work under the hood, day-to-day operations become much easier. Over the coming posts I will walk through the code based on lotus 1.16 and dissect lotus layer by layer.

1. A brief introduction to Proof-of-Replication

Quoting the official explanation: "In order to register a sector with the Filecoin network, the sector has to be sealed. Sealing is a computation-heavy process that produces a unique representation of the data in the form of a proof, called Proof-of-Replication or PoRep." Put simply, a Proof-of-Replication (PoRep) is the unique identifier of a sector that is produced while the sector is being sealed.

PoRep takes three specific inputs: the data itself, the miner actor performing the seal, and the time at which that particular miner seals that particular data. If any one of these inputs changes, the resulting proof is completely different. In other words, if the same miner later tries to seal the same data again, it will produce a different PoRep proof.

Proof-of-Replication is a very large computation. I'll split the walkthrough into two parts, P1 and P2, and explain how PoRep works through the code.

2. P1 code walkthrough

In this article I'll mainly cover the P1 stage of sealing a 32 GiB sector. This is the stage where PoRep's SDR encoding and replication take place.

Since this is the first post, one note up front: different sector states trigger different handler methods in the miner. In version 1.16 you can look around line 460 of extern/storage-sealing/fsm.go, which maps each sector state to its handler. Here I'll only show the states around P1.

...
...
case Packing:
        return m.handlePacking, processed, nil
case GetTicket:
        return m.handleGetTicket, processed, nil
case PreCommit1:
        return m.handlePreCommit1, processed, nil
case PreCommit2:
        return m.handlePreCommit2, processed, nil
...
...

As you can see, the PreCommit1 state is handled by handlePreCommit1, and as shown below it calls SealPreCommit1 to produce the P1 output (which is later handed to P2).

func (m *Sealing) handlePreCommit1(ctx statemachine.Context, sector SectorInfo) error {

        ...
        ...
        pc1o, err := m.sealer.SealPreCommit1(sector.sealingCtx(ctx.Context()), m.minerSector(sector.SectorType, sector.SectorNumber), sector.TicketValue, sector.pieceInfos())
        if err != nil {
                return ctx.Send(SectorSealPreCommit1Failed{xerrors.Errorf("seal pre commit(1) failed: %w", err)})
        }
        return ctx.Send(SectorPreCommit1{
                PreCommit1Out: pc1o,
        })
}

Let's take a closer look at SealPreCommit1 (code below). What ultimately gets called is func (sb *Sealer) SealPreCommit1(...). Inside it are two helpers we run into all the time: AcquireSector(...) and Unpadded().

AcquireSector takes the requested file types (unsealed, cache, sealed, and so on) together with the sector ID and assembles the corresponding paths; for example, for miner t01000 and sector 1 they end up looking roughly like <storage path>/unsealed/s-t01000-1, <storage path>/sealed/s-t01000-1 and <storage path>/cache/s-t01000-1.

Unpadded() returns a piece's unpadded (actual) size in bytes, computed as s - (s / 128). Where there's an unpadded size there is of course a padded size too: Padded() computes it as s + (s / 127).
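As a quick sanity check of those two formulas, here is a tiny standalone Rust snippet (my own illustration, not lotus code) that round-trips the padded and unpadded sizes of a 32 GiB sector:

fn main() {
    // Padded (on-disk) size of a 32 GiB sector.
    let padded: u64 = 32 * 1024 * 1024 * 1024;

    // Unpadded(): s - s/128 -- the usable bytes before Fr32 padding.
    let unpadded = padded - padded / 128;

    // Padded(): s + s/127 -- applying it to the unpadded size gets us back.
    assert_eq!(unpadded + unpadded / 127, padded);

    println!("padded = {} bytes, unpadded = {} bytes", padded, unpadded);
}

The 1/128 overhead corresponds to the two zero bits that Fr32 padding inserts for every 254 bits of data, which is why the two formulas invert each other exactly at these aligned sizes.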


func (sb *Sealer) SealPreCommit1(ctx context.Context, sector storage.SectorRef, ticket abi.SealRandomness, pieces []abi.PieceInfo) (out storage.PreCommit1Out, err error) {
        paths, done, err := sb.sectors.AcquireSector(ctx, sector, storiface.FTUnsealed, storiface.FTSealed|storiface.FTCache, storiface.PathSealing)
        if err != nil {
                return nil, xerrors.Errorf("acquiring sector paths: %w", err)
        }
        ...
        ...
        ...
        var sum abi.UnpaddedPieceSize
        for _, piece := range pieces {
                sum += piece.Size.Unpadded()
        }
        // Get the sector size from the sector's proof type
        ssize, err := sector.ProofType.SectorSize()
        if err != nil {
                return nil, err
        }
        // Check that the total piece size matches the required sector size
        ussize := abi.PaddedPieceSize(ssize).Unpadded()
        if sum != ussize {
                return nil, xerrors.Errorf("aggregated piece sizes don't match sector size: %d != %d (%d)", sum, ussize, int64(ussize-sum))
        }
        // TODO: context cancellation respect
        p1o, err := ffi.SealPreCommitPhase1(
                sector.ProofType,
                paths.Cache,
                paths.Unsealed,
                paths.Sealed,
                sector.ID.Number,
                sector.ID.Miner,
                ticket,
                pieces,
        )
        ...
        ...
}

Now that everything is in place, we can set off on our P1 journey. The code from here on no longer belongs to lotus: the ffi.SealPreCommitPhase1 call above comes from the https://github.com/filecoin-project/filecoin-ffi library, and through the following method we cross over into Rust to implement P1 (I'm skipping one intermediate call along the way; it does nothing special).

func SealPreCommitPhase1(registeredProof RegisteredSealProof, cacheDirPath SliceRefUint8, stagedSectorPath SliceRefUint8, sealedSectorPath SliceRefUint8, sectorId uint64, proverId *ByteArray32, ticket *ByteArray32, pieces SliceRefPublicPieceInfo) ([]byte, error) {
        resp := C.seal_pre_commit_phase1(registeredProof, cacheDirPath, stagedSectorPath, sealedSectorPath, C.uint64_t(sectorId), proverId, ticket, pieces)
        defer resp.destroy()

        if err := CheckErr(resp); err != nil {
                return nil, err
        }
        return resp.value.copy(), nil
}

The C functions here are really the ffi library's own Rust side; the method it calls is shown below:

fn seal_pre_commit_phase1(
    registered_proof: RegisteredSealProof,
    cache_dir_path: c_slice::Ref<u8>,
    staged_sector_path: c_slice::Ref<u8>,
    sealed_sector_path: c_slice::Ref<u8>,
    sector_id: u64,
    prover_id: &[u8; 32],
    ticket: &[u8; 32],
    pieces: c_slice::Ref<PublicPieceInfo>,
) -> repr_c::Box<SealPreCommitPhase1Response> {
    catch_panic_response("seal_pre_commit_phase1", || {
        let public_pieces: Vec<PieceInfo> = pieces.iter().map(Into::into).collect();

        let result = seal::seal_pre_commit_phase1(
            registered_proof.into(),
            as_path_buf(&cache_dir_path)?,
            as_path_buf(&staged_sector_path)?,
            as_path_buf(&sealed_sector_path)?,
            *prover_id,
            SectorId::from(sector_id),
            *ticket,
            &public_pieces,
        )?;
        let result = serde_json::to_vec(&result)?;

        Ok(result.into_boxed_slice().into())
    })
}

The seal crate above is https://github.com/filecoin-project/rust-filecoin-proofs-api. In the file this method lives in (src/seal.rs), many functions come paired with a *_inner method; seal_pre_commit_phase1 itself is just a dispatcher, so we can go straight to seal_pre_commit_phase1_inner.

pub fn seal_pre_commit_phase1<R, S, T>(
    registered_proof: RegisteredSealProof,
    cache_path: R,
    in_path: S,
    out_path: T,
    prover_id: ProverId,
    sector_id: SectorId,
    ticket: Ticket,
    piece_infos: &[PieceInfo],
) -> Result<SealPreCommitPhase1Output>
where
    R: AsRef<Path>,
    S: AsRef<Path>,
    T: AsRef<Path>,
{
    ensure!(
        registered_proof.major_version() == 1,
        "unusupported version"
    );

    with_shape!(
        u64::from(registered_proof.sector_size()),
        seal_pre_commit_phase1_inner,
        registered_proof,
        cache_path.as_ref(),
        in_path.as_ref(),
        out_path.as_ref(),
        prover_id,
        sector_id,
        ticket,
        piece_infos
    )
}

In the inner method, filecoin_proofs_v1::seal_pre_commit_phase1 calls into the implementation part of the proofs subsystem. filecoin_proofs_v1 is the library at https://github.com/filecoin-project/rust-fil-proofs.

fn seal_pre_commit_phase1_inner<Tree: 'static + MerkleTreeTrait>(
    registered_proof: RegisteredSealProof,
    cache_path: &Path,
    in_path: &Path,
    out_path: &Path,
    prover_id: ProverId,
    sector_id: SectorId,
    ticket: Ticket,
    piece_infos: &[PieceInfo],
) -> Result<SealPreCommitPhase1Output> {
    let config = registered_proof.as_v1_config();
    let output = filecoin_proofs_v1::seal_pre_commit_phase1::<_, _, _, Tree>(
        config,
        cache_path,
        in_path,
        out_path,
        prover_id,
        sector_id,
        ticket,
        piece_infos,
    )?;

    let filecoin_proofs_v1::types::SealPreCommitPhase1Output::<Tree> {
        labels,
        config,
        comm_d,
    } = output;

    Ok(SealPreCommitPhase1Output {
        registered_proof,
        labels: Labels::from_raw::<Tree>(registered_proof, &labels)?,
        config,
        comm_d,
    })
}

filecoin_proofs_v1::seal_pre_commit_phase1 is where P1 is actually implemented, and this is where I'll go through it in detail and let P1 surface piece by piece.

pub fn seal_pre_commit_phase1<R, S, T, Tree: 'static + MerkleTreeTrait>(
    porep_config: PoRepConfig,
    cache_path: R,
    in_path: S,
    out_path: T,
    prover_id: ProverId,
    sector_id: SectorId,
    ticket: Ticket,
    piece_infos: &[PieceInfo],
) -> Result<SealPreCommitPhase1Output<Tree>>
where
    R: AsRef<Path>,
    S: AsRef<Path>,
    T: AsRef<Path>,
{
    info!("seal_pre_commit_phase1:start: {:?}", sector_id);

    // Sanity check all input path types.
    ensure!(
        metadata(in_path.as_ref())?.is_file(),
        "in_path must be a file"
    );
    ensure!(
        metadata(out_path.as_ref())?.is_file(),
        "out_path must be a file"
    );
    ensure!(
        metadata(cache_path.as_ref())?.is_dir(),
        "cache_path must be a directory"
    );

    let sector_bytes = usize::from(PaddedBytesAmount::from(porep_config));

    fs::metadata(&in_path)
        .with_context(|| format!("could not read in_path={:?})", in_path.as_ref().display()))?;

    fs::metadata(&out_path)
        .with_context(|| format!("could not read out_path={:?}", out_path.as_ref().display()))?;

    // Copy unsealed data to output location, where it will be sealed in place.
    fs::copy(&in_path, &out_path).with_context(|| {
        format!(
            "could not copy in_path={:?} to out_path={:?}",
            in_path.as_ref().display(),
            out_path.as_ref().display()
        )
    })?;

    let f_data = OpenOptions::new()
        .read(true)
        .write(true)
        .open(&out_path)
        .with_context(|| format!("could not open out_path={:?}", out_path.as_ref().display()))?;

    // Zero-pad the data to the requested size by extending the underlying file if needed.
    f_data.set_len(sector_bytes as u64)?;

    let data = unsafe {
        MmapOptions::new()
            .map_mut(&f_data)
            .with_context(|| format!("could not mmap out_path={:?}", out_path.as_ref().display()))?
    };

    let compound_setup_params = compound_proof::SetupParams {
        vanilla_params: setup_params(
            PaddedBytesAmount::from(porep_config),
            usize::from(PoRepProofPartitions::from(porep_config)),
            porep_config.porep_id,
            porep_config.api_version,
        )?,
        partitions: Some(usize::from(PoRepProofPartitions::from(porep_config))),
        priority: false,
    };

    // Use the setup params to build public_params; its vanilla_params.graph field
    // is the constructed graph data structure.
    let compound_public_params = <StackedCompound<Tree, DefaultPieceHasher> as CompoundProof<
        StackedDrg<'_, Tree, DefaultPieceHasher>,
        _,
    >>::setup(&compound_setup_params)?;

    trace!("building merkle tree for the original data");
    let (config, comm_d) = measure_op(Operation::CommD, || -> Result<_> {
        let base_tree_size = get_base_tree_size::<DefaultBinaryTree>(porep_config.sector_size)?;
        let base_tree_leafs = get_base_tree_leafs::<DefaultBinaryTree>(base_tree_size)?;
        ensure!(
            compound_public_params.vanilla_params.graph.size() == base_tree_leafs,
            "graph size and leaf size don't match"
        );

        trace!(
            "seal phase 1: sector_size {}, base tree size {}, base tree leafs {}",
            u64::from(porep_config.sector_size),
            base_tree_size,
            base_tree_leafs,
        );

        let mut config = StoreConfig::new(
            cache_path.as_ref(),
            CacheKey::CommDTree.to_string(),
            default_rows_to_discard(base_tree_leafs, BINARY_ARITY),
        );

        let data_tree = create_base_merkle_tree::<BinaryMerkleTree<DefaultPieceHasher>>(
            Some(config.clone()),
            base_tree_leafs,
            &data,
        )?;
        drop(data);

        config.size = Some(data_tree.len());
        let comm_d_root: Fr = data_tree.root().into();
        let comm_d = commitment_from_fr(comm_d_root);

        drop(data_tree);

        Ok((config, comm_d))
    })?;

    trace!("verifying pieces");
    ensure!(
        verify_pieces(&comm_d, piece_infos, porep_config.into())?,
        "pieces and comm_d do not match"
    );

    let replica_id = generate_replica_id::<Tree::Hasher, _>(
        &prover_id,
        sector_id.into(),
        &ticket,
        comm_d,
        &porep_config.porep_id,
    );

    let labels = StackedDrg::<Tree, DefaultPieceHasher>::replicate_phase1(
        &compound_public_params.vanilla_params,
        &replica_id,
        config.clone(),
    )?;

    let out = SealPreCommitPhase1Output {
        labels,
        config,
        comm_d,
    };

    info!("seal_pre_commit_phase1:finish: {:?}", sector_id);
    Ok(out)
}

P1 implementation notes

In the seal_pre_commit_phase1 code above there are three paths, corresponding to: in_path -> unsealed path, out_path -> sealed path, cache_path -> cache path. The code first checks these three paths (naturally, since it is about to use them, they had better exist): two of them must be files and one must be a directory.

After the path checks comes fs::copy, which copies the unsealed file into the sealed file to start the "sealing" (when I first came to lotus I assumed sealing was something fancy; it turns out this step is just a copy).

Once the copy is done, the sealed file is opened and .set_len() is used to pad it out, so that the sealed data reaches the sector size required by the proof type configuration.
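If you want to see those two file operations in isolation, here is a minimal Rust sketch (hypothetical file names and a toy sector size, not the real proofs code): set_len() grows the copied file to the full sector size, and the newly added region reads back as zeros.

use std::fs::{self, OpenOptions};

fn main() -> std::io::Result<()> {
    // Hypothetical paths standing in for in_path (unsealed) and out_path (sealed).
    let in_path = "unsealed-demo.bin";
    let out_path = "sealed-demo.bin";
    let sector_bytes: u64 = 1024; // stand-in for the real 32 GiB sector size

    fs::write(in_path, b"piece data")?; // pretend this is the unsealed data
    fs::copy(in_path, out_path)?;       // "sealing" starts as a plain copy

    let f = OpenOptions::new().read(true).write(true).open(out_path)?;
    f.set_len(sector_bytes)?;           // zero-extend to the full sector size

    assert_eq!(fs::metadata(out_path)?.len(), sector_bytes);
    Ok(())
}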

setup_params()

The setup_params() method builds the setup parameters from the proof type configuration. The arguments passed in are: sector size, partition count, PoRep ID, and API version. For the partition count, see the partitions() method in https://github.com/filecoin-project/rust-filecoin-proofs-api/blob/23ae2893741829bddc29d7211e06c914bab5423c/src/registry.rs, with the concrete values in https://github.com/filecoin-project/rust-fil-proofs/blob/ec2ef88a17ffed991b64dc8d96b30c36b275eca0/filecoin-proofs/src/constants.rs. My analysis focuses on 32 GiB sectors, so the partition count is 10. I won't go through the other three arguments; like the partition count, they all come from those two files.
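Before looking at the setup_params source below, here is a quick worked example of what those constants amount to for a 32 GiB sector; the values (10 partitions, 11 layers, DRG degree 6, expansion degree 8) are the ones read from the two files linked above:

fn main() {
    let sector_bytes: u64 = 32 * 1024 * 1024 * 1024;

    // Each graph node is one 32-byte label, so a 32 GiB layer has 2^30 nodes.
    let nodes = sector_bytes / 32;
    assert_eq!(nodes, 1 << 30);

    // Constants for 32 GiB sectors (from the linked registry.rs / constants.rs).
    let partitions: u64 = 10; // PoRep proof partitions
    let layers: u64 = 11;     // SDR layers
    let drg_degree: u64 = 6;  // in-layer DRG parents per node
    let exp_degree: u64 = 8;  // expander parents drawn from the previous layer

    // Total label data written during P1: 11 layers x 32 GiB = 352 GiB.
    let label_gib = nodes * layers * 32 / (1024 * 1024 * 1024);
    println!(
        "partitions={}, layers={}, parents per node={}, label data={} GiB",
        partitions,
        layers,
        drg_degree + exp_degree,
        label_gib
    );
}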

pub fn setup_params(
    sector_bytes: PaddedBytesAmount,
    partitions: usize,
    porep_id: [u8; 32],
    api_version: ApiVersion,
) -> Result<stacked::SetupParams> {
    // Work out the layer count and the challenge count (layer_challenges).
    let layer_challenges = select_challenges(
        partitions,
        *POREP_MINIMUM_CHALLENGES
            .read()
            .expect("POREP_MINIMUM_CHALLENGES poisoned")
            .get(&u64::from(sector_bytes))
            .expect("unknown sector size") as usize,
        *LAYERS
            .read()
            .expect("LAYERS poisoned")
            .get(&u64::from(sector_bytes))
            .expect("unknown sector size"),
    );

    let sector_bytes = u64::from(sector_bytes);

    ensure!(
        sector_bytes % 32 == 0,
        "sector_bytes ({}) must be a multiple of 32",
        sector_bytes,
    );

    let nodes = (sector_bytes / 32) as usize; // Node count: SDR has 11 layers, and for a 32 GiB sector each layer has as many nodes as there are bytes in 1 GiB.
    let degree = DRG_DEGREE; // Base degree used for all DRG graphs; DRG_DEGREE = 6.
    let expansion_degree = EXP_DEGREE; // EXP_DEGREE = 8: how many nodes are drawn from the previous layer to compute each node in the current layer.

    Ok(stacked::SetupParams {
        nodes,
        degree,
        expansion_degree,
        porep_id,
        layer_challenges,
        api_version,
    })
}

Generating the sealed data's Merkle tree and its comm_d

Having seen setup_params, let's go back to the compound_public_params value in seal_pre_commit_phase1. The setup() call carries over the values from compound_setup_params and adds one crucial field, vanilla_params.graph, which is the constructed graph data structure.

Next we come to the block (around line 70 of seal_pre_commit_phase1) that builds the Merkle tree and comm_d:

let (config, comm_d) = measure_op(Operation::CommD, || -> Result<_> {
    let base_tree_size = get_base_tree_size::<DefaultBinaryTree>(porep_config.sector_size)?;
    let base_tree_leafs = get_base_tree_leafs::<DefaultBinaryTree>(base_tree_size)?;
    ensure!(
        compound_public_params.vanilla_params.graph.size() == base_tree_leafs,
        "graph size and leaf size don't match"
    );

    trace!(
        "seal phase 1: sector_size {}, base tree size {}, base tree leafs {}",
        u64::from(porep_config.sector_size),
        base_tree_size,
        base_tree_leafs,
    );

    let mut config = StoreConfig::new(
        cache_path.as_ref(),
        CacheKey::CommDTree.to_string(),
        default_rows_to_discard(base_tree_leafs, BINARY_ARITY),
    );

    // Build the Merkle tree; comm_d is derived from its root.
    let data_tree = create_base_merkle_tree::<BinaryMerkleTree<DefaultPieceHasher>>(
        Some(config.clone()),
        base_tree_leafs,
        &data,
    )?;
    drop(data);

    config.size = Some(data_tree.len());
    let comm_d_root: Fr = data_tree.root().into();
    let comm_d = commitment_from_fr(comm_d_root);

    drop(data_tree);

    Ok((config, comm_d))
})?;

Here we first create the tree's StoreConfig, then use the config, base_tree_leafs and the padded sealed data to build a Merkle tree. With the tree built we can take its root, and from the root commitment_from_fr computes comm_d.
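To make "tree root → comm_d" a bit more concrete, here is a toy binary Merkle root over 32-byte leaves, written as a standalone sketch using the sha2 crate (the real DefaultPieceHasher is a sha256 variant whose outputs are truncated to fit the BLS12-381 field, and the real tree is built over the Fr32-padded sector data; this sketch skips both details):

use sha2::{Digest, Sha256};

// Fold a power-of-two number of 32-byte leaves up to a single root.
fn merkle_root(leaves: &[[u8; 32]]) -> [u8; 32] {
    assert!(!leaves.is_empty() && leaves.len().is_power_of_two());
    let mut layer: Vec<[u8; 32]> = leaves.to_vec();
    while layer.len() > 1 {
        layer = layer
            .chunks(2)
            .map(|pair| {
                let digest = Sha256::new()
                    .chain_update(pair[0])
                    .chain_update(pair[1])
                    .finalize();
                let mut node = [0u8; 32];
                node.copy_from_slice(&digest);
                node
            })
            .collect();
    }
    layer[0]
}

fn main() {
    // Four dummy leaves standing in for the sector's data nodes.
    let leaves = [[0u8; 32], [1u8; 32], [2u8; 32], [3u8; 32]];
    println!("toy comm_d-like root: {:02x?}", merkle_root(&leaves));
}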

Generating the replica ID (replica_id)

Once we have comm_d, verify_pieces is used to sanity-check it against the piece infos. I won't cover that here; if you're interested, go read the code yourself.

Let's see how the replica ID is generated.

let replica_id = generate_replica_id::<Tree::Hasher, _>(
    &prover_id,
    sector_id.into(),
    &ticket,
    comm_d,
    &porep_config.porep_id,
);

comm_d was derived from the data itself; adding the miner (prover) ID, sector ID, ticket and PoRep ID on top of it yields the replica ID.

/// Generate the replica id as expected for Stacked DRG.
pub fn generate_replica_id<H: Hasher, T: AsRef<[u8]>>(
    prover_id: &[u8; 32],
    sector_id: u64,
    ticket: &[u8; 32],
    comm_d: T,
    porep_seed: &[u8; 32],
) -> H::Domain {
    // Feed the inputs into the hasher in a chain.
    let hash = Sha256::new()
        .chain_update(prover_id)
        .chain_update(&sector_id.to_be_bytes())
        .chain_update(ticket)
        .chain_update(&comm_d)
        .chain_update(porep_seed)
        .finalize();

    // Convert the 32-byte slice (little-endian, non-Montgomery form) into an
    // Fr::Repr by zeroing the two most significant bits of le_bytes.
    bytes_into_fr_repr_safe(hash.as_ref()).into()
}

Generating the labels

Now comes the final P1 step: generating the labels. public_params, the replica ID and the tree StoreConfig are passed in as arguments.

pub fn replicate_phase1(
    pp: &'a PublicParams<Tree>,
    replica_id: &<Tree::Hasher as Hasher>::Domain,
    config: StoreConfig,
) -> Result<Labels<Tree>> {
    info!("replicate_phase1");

    let labels = measure_op(Operation::EncodeWindowTimeAll, || {
        Self::generate_labels_for_encoding(&pp.graph, &pp.layer_challenges, replica_id, config)
    })?
    .0;

    Ok(labels)
}

Here you can see the code pull two things out of public_params: the .graph field, which is the constructed graph data structure, and layer_challenges, which holds the number of layers and the challenge count.

Next, look at generate_labels_for_encoding. Here the SDR encoding that creates the labels can run either multi-core or single-core.

pub fn generate_labels_for_encoding(
    graph: &StackedBucketGraph<Tree::Hasher>,
    layer_challenges: &LayerChallenges,
    replica_id: &<Tree::Hasher as Hasher>::Domain,
    config: StoreConfig,
) -> Result<(Labels<Tree>, Vec<LayerState>)> {
    let mut parent_cache = graph.parent_cache()?;

    #[cfg(feature = "multicore-sdr")]
    {
        if SETTINGS.use_multicore_sdr {
            info!("multi core replication");
            create_label::multi::create_labels_for_encoding(
                graph,
                &parent_cache,
                layer_challenges.layers(),
                replica_id,
                config,
            )
        } else {
            info!("single core replication");
            create_label::single::create_labels_for_encoding(
                graph,
                &mut parent_cache,
                layer_challenges.layers(),
                replica_id,
                config,
            )
        }
    }

    #[cfg(not(feature = "multicore-sdr"))]
    {
        info!("single core replication");
        create_label::single::create_labels_for_encoding(
            graph,
            &mut parent_cache,
            layer_challenges.layers(),
            replica_id,
            config,
        )
    }
}

I'll leave the label-generation code itself at the links below for anyone who wants to dig in; I won't go through it in detail here.

Multi-core: https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-porep/src/stacked/vanilla/create_label/multi.rs

Single-core: https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-porep/src/stacked/vanilla/create_label/single.rs
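I won't go through those two files, but to give a feel for what they compute, here is a heavily simplified, self-contained sketch of SDR labelling using the sha2 crate (my own illustration with made-up parent selection, not the real code): each node's label hashes the replica_id, the layer and node indices, and the labels of its parents; the "DRG" parents live in the current layer and the "expander" parents in the previous one, which is what forces the layers, and the nodes within a layer, to be computed sequentially.

use sha2::{Digest, Sha256};

type Label = [u8; 32];

// Made-up parent selection. The real parents come from the stacked bucket
// graph: 6 DRG parents in the same layer and 8 expander parents in the
// previous layer. Here we just pick deterministic earlier indices.
fn toy_parents(node: u64, count: u64) -> Vec<u64> {
    if node == 0 {
        return Vec::new();
    }
    (1..=count).map(|i| (i * 2_654_435_761) % node).collect()
}

fn label_layer(replica_id: &Label, layer: u32, prev: Option<&[Label]>, nodes: u64) -> Vec<Label> {
    let mut labels: Vec<Label> = Vec::with_capacity(nodes as usize);
    for node in 0..nodes {
        let mut hasher = Sha256::new()
            .chain_update(replica_id)
            .chain_update(layer.to_be_bytes())
            .chain_update(node.to_be_bytes());
        // "DRG" parents: labels already computed in this layer.
        for p in toy_parents(node, 6) {
            hasher.update(labels[p as usize]);
        }
        // "Expander" parents: labels from the previous layer, if any.
        if let Some(prev) = prev {
            for p in toy_parents(node, 8) {
                hasher.update(prev[p as usize]);
            }
        }
        let mut label = [0u8; 32];
        label.copy_from_slice(&hasher.finalize());
        labels.push(label);
    }
    labels
}

fn main() {
    let replica_id: Label = [7u8; 32];
    let nodes = 16u64; // the real thing has 2^30 nodes per layer for 32 GiB
    let mut layers: Vec<Vec<Label>> = Vec::new();
    for layer in 1..=11u32 {
        let prev = layers.last().map(|l| l.as_slice());
        layers.push(label_layer(&replica_id, layer, prev, nodes));
    }
    println!("computed {} layers of {} labels each", layers.len(), nodes);
}

Roughly speaking, the real code reads the parent indices from a precomputed parent cache on disk, and the multi-core path pipelines the hashing across cores while still respecting these dependencies.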

3. Summary

To be honest, I haven't worked with Rust much; reading this code gave me a headache at first, and in the end I just gritted my teeth and chewed through it.

If any experts find mistakes in this article, corrections are welcome.

References:

1. https://spec.filecoin.io/#section-algorithms.pos.porep

Author: 三峡星未来数据 (https://www.bilibili.com/read/cv18486728?spm_id_from=333.999.0.0) Source: bilibili
