Exploring Filecoin: Proof-of-Replication (Part 1)

Tutorial

Preface

When running Filecoin we inevitably run into all sorts of operational issues, and the best way to prepare for them is to understand how the code is implemented: once we know how things work under the hood, day-to-day operations become much easier. Over the coming posts I will walk through the code based on lotus 1.16 and dissect lotus layer by layer.

1. A brief introduction to Proof-of-Replication

Quoting the official explanation: "In order to register a sector with the Filecoin network, the sector has to be sealed. Sealing is a computation-heavy process that produces a unique representation of the data in the form of a proof, called Proof-of-Replication or PoRep." Put simply, a Proof-of-Replication (PoRep) is the unique identifier of a sector that is produced while the sector is being sealed.

PoRep takes three specific inputs: the data itself, the miner actor performing the seal, and the time at which that particular miner seals that particular data. If any one of these inputs changes, the resulting proof is completely different. In other words, if the same miner later tries to seal the same data again, it will produce a different PoRep proof.

Proof-of-Replication is a very large computation. I'll split the walkthrough into two parts, P1 and P2, and explain how PoRep works through the code.

2. P1 code walkthrough

In this article I'll mainly cover the P1 stage of sealing a 32 GiB sector. This is the stage where PoRep's SDR encoding and replication take place.

Since this is the first post, one note up front: different sector states trigger different handler methods in the miner. In version 1.16 you can look around line 460 of extern/storage-sealing/fsm.go, which maps each sector state to its handler. Here I'll only show the states around P1.

...
...
case Packing:
        return m.handlePacking, processed, nil
case GetTicket:
        return m.handleGetTicket, processed, nil
case PreCommit1:
        return m.handlePreCommit1, processed, nil
case PreCommit2:
        return m.handlePreCommit2, processed, nil
...
...

As you can see, the PreCommit1 state is handled by handlePreCommit1, and as shown below it calls SealPreCommit1 to produce the P1 output (which is later handed to P2).

func (m *Sealing) handlePreCommit1(ctx statemachine.Context, sector SectorInfo) error {

        ...
        ...
        pc1o, err := m.sealer.SealPreCommit1(sector.sealingCtx(ctx.Context()), m.minerSector(sector.SectorType, sector.SectorNumber), sector.TicketValue, sector.pieceInfos())
        if err != nil {
                return ctx.Send(SectorSealPreCommit1Failed{xerrors.Errorf("seal pre commit(1) failed: %w", err)})
        }
        return ctx.Send(SectorPreCommit1{
                PreCommit1Out: pc1o,
        })
}

Let's take a closer look at SealPreCommit1 (code below). What ultimately gets called is func (sb *Sealer) SealPreCommit1(...). Inside it are two helpers we run into all the time: AcquireSector(...) and Unpadded().

AcquireSector takes the requested file types (unsealed, cache, sealed, and so on) together with the sector ID and assembles the corresponding paths; for example, for miner t01000 and sector 1 they end up looking roughly like <storage path>/unsealed/s-t01000-1, <storage path>/sealed/s-t01000-1 and <storage path>/cache/s-t01000-1.

Unpadded() returns a piece's unpadded (actual) size in bytes, computed as s - (s / 128). Where there's an unpadded size there is of course a padded size too: Padded() computes it as s + (s / 127).
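As a quick sanity check of those two formulas, here is a tiny standalone Rust snippet (my own illustration, not lotus code) that round-trips the padded and unpadded sizes of a 32 GiB sector:

fn main() {
    // Padded (on-disk) size of a 32 GiB sector.
    let padded: u64 = 32 * 1024 * 1024 * 1024;

    // Unpadded(): s - s/128 -- the usable bytes before Fr32 padding.
    let unpadded = padded - padded / 128;

    // Padded(): s + s/127 -- applying it to the unpadded size gets us back.
    assert_eq!(unpadded + unpadded / 127, padded);

    println!("padded = {} bytes, unpadded = {} bytes", padded, unpadded);
}

The 1/128 overhead corresponds to the two zero bits that Fr32 padding inserts for every 254 bits of data, which is why the two formulas invert each other exactly at these aligned sizes.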


func (sb *Sealer) SealPreCommit1(ctx context.Context, sector storage.SectorRef, ticket abi.SealRandomness, pieces []abi.PieceInfo) (out storage.PreCommit1Out, err error) {
        paths, done, err := sb.sectors.AcquireSector(ctx, sector, storiface.FTUnsealed, storiface.FTSealed|storiface.FTCache, storiface.PathSealing)
        if err != nil {
                return nil, xerrors.Errorf("acquiring sector paths: %w", err)
        }
        ...
        ...
        ...
        var sum abi.UnpaddedPieceSize
        for _, piece := range pieces {
                sum += piece.Size.Unpadded()
        }
        // Get the sector size from the sector's proof type
        ssize, err := sector.ProofType.SectorSize()
        if err != nil {
                return nil, err
        }
        // Check that the total piece size matches the required sector size
        ussize := abi.PaddedPieceSize(ssize).Unpadded()
        if sum != ussize {
                return nil, xerrors.Errorf("aggregated piece sizes don't match sector size: %d != %d (%d)", sum, ussize, int64(ussize-sum))
        }
        // TODO: context cancellation respect
        p1o, err := ffi.SealPreCommitPhase1(
                sector.ProofType,
                paths.Cache,
                paths.Unsealed,
                paths.Sealed,
                sector.ID.Number,
                sector.ID.Miner,
                ticket,
                pieces,
        )
        ...
        ...
}

Now that everything is in place, we can set off on our P1 journey. The code from here on no longer belongs to lotus: the ffi.SealPreCommitPhase1 call above comes from the https://github.com/filecoin-project/filecoin-ffi library, and through the following method we cross over into Rust to implement P1 (I'm skipping one intermediate call along the way; it does nothing special).

func SealPreCommitPhase1(registeredProof RegisteredSealProof, cacheDirPath SliceRefUint8, stagedSectorPath SliceRefUint8, sealedSectorPath SliceRefUint8, sectorId uint64, proverId *ByteArray32, ticket *ByteArray32, pieces SliceRefPublicPieceInfo) ([]byte, error) {
        resp := C.seal_pre_commit_phase1(registeredProof, cacheDirPath, stagedSectorPath, sealedSectorPath, C.uint64_t(sectorId), proverId, ticket, pieces)
        defer resp.destroy()

        if err := CheckErr(resp); err != nil {
                return nil, err
        }
        return resp.value.copy(), nil
}

The C functions here are really the ffi library's own Rust side; the method it calls is shown below:

fn seal_pre_commit_phase1(
    registered_proof: RegisteredSealProof,
    cache_dir_path: c_slice::Ref<u8>,
    staged_sector_path: c_slice::Ref<u8>,
    sealed_sector_path: c_slice::Ref<u8>,
    sector_id: u64,
    prover_id: &[u8; 32],
    ticket: &[u8; 32],
    pieces: c_slice::Ref<PublicPieceInfo>,
) -> repr_c::Box<SealPreCommitPhase1Response> {
    catch_panic_response("seal_pre_commit_phase1", || {
        let public_pieces: Vec<PieceInfo> = pieces.iter().map(Into::into).collect();

        let result = seal::seal_pre_commit_phase1(
            registered_proof.into(),
            as_path_buf(&cache_dir_path)?,
            as_path_buf(&staged_sector_path)?,
            as_path_buf(&sealed_sector_path)?,
            *prover_id,
            SectorId::from(sector_id),
            *ticket,
            &public_pieces,
        )?;
        let result = serde_json::to_vec(&result)?;

        Ok(result.into_boxed_slice().into())
    })
}

The seal crate above is https://github.com/filecoin-project/rust-filecoin-proofs-api. In the file this method lives in (src/seal.rs), many functions come paired with a *_inner method; seal_pre_commit_phase1 itself is just a dispatcher, so we can go straight to seal_pre_commit_phase1_inner.

pub fn seal_pre_commit_phase1<R, S, T>(
    registered_proof: RegisteredSealProof,
    cache_path: R,
    in_path: S,
    out_path: T,
    prover_id: ProverId,
    sector_id: SectorId,
    ticket: Ticket,
    piece_infos: &[PieceInfo],
) -> Result<SealPreCommitPhase1Output>
where
    R: AsRef<Path>,
    S: AsRef<Path>,
    T: AsRef<Path>,
{
    ensure!(
        registered_proof.major_version() == 1,
        "unusupported version"
    );

    with_shape!(
        u64::from(registered_proof.sector_size()),
        seal_pre_commit_phase1_inner,
        registered_proof,
        cache_path.as_ref(),
        in_path.as_ref(),
        out_path.as_ref(),
        prover_id,
        sector_id,
        ticket,
        piece_infos
    )
}

In the inner method, filecoin_proofs_v1::seal_pre_commit_phase1 calls into the implementation part of the proofs subsystem. filecoin_proofs_v1 is the library at https://github.com/filecoin-project/rust-fil-proofs.

fn seal_pre_commit_phase1_inner<Tree: 'static + MerkleTreeTrait>(
    registered_proof: RegisteredSealProof,
    cache_path: &Path,
    in_path: &Path,
    out_path: &Path,
    prover_id: ProverId,
    sector_id: SectorId,
    ticket: Ticket,
    piece_infos: &[PieceInfo],
) -> Result<SealPreCommitPhase1Output> {
    let config = registered_proof.as_v1_config();
    let output = filecoin_proofs_v1::seal_pre_commit_phase1::<_, _, _, Tree>(
        config,
        cache_path,
        in_path,
        out_path,
        prover_id,
        sector_id,
        ticket,
        piece_infos,
    )?;

    let filecoin_proofs_v1::types::SealPreCommitPhase1Output::<Tree> {
        labels,
        config,
        comm_d,
    } = output;

    Ok(SealPreCommitPhase1Output {
        registered_proof,
        labels: Labels::from_raw::<Tree>(registered_proof, &labels)?,
        config,
        comm_d,
    })
}

filecoin_proofs_v1::seal_pre_commit_phase1 is where P1 is actually implemented, and this is where I'll go through it in detail and let P1 surface piece by piece.

pub fn seal_pre_commit_phase1<R, S, T, Tree: 'static + MerkleTreeTrait>(
    porep_config: PoRepConfig,
    cache_path: R,
    in_path: S,
    out_path: T,
    prover_id: ProverId,
    sector_id: SectorId,
    ticket: Ticket,
    piece_infos: &[PieceInfo],
) -> Result<SealPreCommitPhase1Output<Tree>>
where
    R: AsRef<Path>,
    S: AsRef<Path>,
    T: AsRef<Path>,
{
    info!("seal_pre_commit_phase1:start: {:?}", sector_id);

    // Sanity check all input path types.
    ensure!(
        metadata(in_path.as_ref())?.is_file(),
        "in_path must be a file"
    );
    ensure!(
        metadata(out_path.as_ref())?.is_file(),
        "out_path must be a file"
    );
    ensure!(
        metadata(cache_path.as_ref())?.is_dir(),
        "cache_path must be a directory"
    );

    let sector_bytes = usize::from(PaddedBytesAmount::from(porep_config));

    fs::metadata(&in_path)
        .with_context(|| format!("could not read in_path={:?})", in_path.as_ref().display()))?;

    fs::metadata(&out_path)
        .with_context(|| format!("could not read out_path={:?}", out_path.as_ref().display()))?;

    // Copy unsealed data to output location, where it will be sealed in place.
    fs::copy(&in_path, &out_path).with_context(|| {
        format!(
            "could not copy in_path={:?} to out_path={:?}",
            in_path.as_ref().display(),
            out_path.as_ref().display()
        )
    })?;

    let f_data = OpenOptions::new()
        .read(true)
        .write(true)
        .open(&out_path)
        .with_context(|| format!("could not open out_path={:?}", out_path.as_ref().display()))?;

    // Zero-pad the data to the requested size by extending the underlying file if needed.
    f_data.set_len(sector_bytes as u64)?;

    let data = unsafe {
        MmapOptions::new()
            .map_mut(&f_data)
            .with_context(|| format!("could not mmap out_path={:?}", out_path.as_ref().display()))?
    };

    let compound_setup_params = compound_proof::SetupParams {
        vanilla_params: setup_params(
            PaddedBytesAmount::from(porep_config),
            usize::from(PoRepProofPartitions::from(porep_config)),
            porep_config.porep_id,
            porep_config.api_version,
        )?,
        partitions: Some(usize::from(PoRepProofPartitions::from(porep_config))),
        priority: false,
    };

    // Use the setup params to build public_params; its vanilla_params.graph field
    // is the constructed graph data structure.
    let compound_public_params = <StackedCompound<Tree, DefaultPieceHasher> as CompoundProof<
        StackedDrg<'_, Tree, DefaultPieceHasher>,
        _,
    >>::setup(&compound_setup_params)?;

    trace!("building merkle tree for the original data");
    let (config, comm_d) = measure_op(Operation::CommD, || -> Result<_> {
        let base_tree_size = get_base_tree_size::<DefaultBinaryTree>(porep_config.sector_size)?;
        let base_tree_leafs = get_base_tree_leafs::<DefaultBinaryTree>(base_tree_size)?;
        ensure!(
            compound_public_params.vanilla_params.graph.size() == base_tree_leafs,
            "graph size and leaf size don't match"
        );

        trace!(
            "seal phase 1: sector_size {}, base tree size {}, base tree leafs {}",
            u64::from(porep_config.sector_size),
            base_tree_size,
            base_tree_leafs,
        );

        let mut config = StoreConfig::new(
            cache_path.as_ref(),
            CacheKey::CommDTree.to_string(),
            default_rows_to_discard(base_tree_leafs, BINARY_ARITY),
        );

        let data_tree = create_base_merkle_tree::<BinaryMerkleTree<DefaultPieceHasher>>(
            Some(config.clone()),
            base_tree_leafs,
            &data,
        )?;
        drop(data);

        config.size = Some(data_tree.len());
        let comm_d_root: Fr = data_tree.root().into();
        let comm_d = commitment_from_fr(comm_d_root);

        drop(data_tree);

        Ok((config, comm_d))
    })?;

    trace!("verifying pieces");
    ensure!(
        verify_pieces(&comm_d, piece_infos, porep_config.into())?,
        "pieces and comm_d do not match"
    );

    let replica_id = generate_replica_id::<Tree::Hasher, _>(
        &prover_id,
        sector_id.into(),
        &ticket,
        comm_d,
        &porep_config.porep_id,
    );

    let labels = StackedDrg::<Tree, DefaultPieceHasher>::replicate_phase1(
        &compound_public_params.vanilla_params,
        &replica_id,
        config.clone(),
    )?;

    let out = SealPreCommitPhase1Output {
        labels,
        config,
        comm_d,
    };

    info!("seal_pre_commit_phase1:finish: {:?}", sector_id);
    Ok(out)
}

P1 implementation notes

In the seal_pre_commit_phase1 code above there are three paths, corresponding to: in_path -> unsealed path, out_path -> sealed path, cache_path -> cache path. The code first checks these three paths (naturally, since it is about to use them, they had better exist): two of them must be files and one must be a directory.

After the path checks comes fs::copy, which copies the unsealed file into the sealed file to start the "sealing" (when I first came to lotus I assumed sealing was something fancy; it turns out this step is just a copy).

Once the copy is done, the sealed file is opened and .set_len() is used to pad it out, so that the sealed data reaches the sector size required by the proof type configuration.
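If you want to see those two file operations in isolation, here is a minimal Rust sketch (hypothetical file names and a toy sector size, not the real proofs code): set_len() grows the copied file to the full sector size, and the newly added region reads back as zeros.

use std::fs::{self, OpenOptions};

fn main() -> std::io::Result<()> {
    // Hypothetical paths standing in for in_path (unsealed) and out_path (sealed).
    let in_path = "unsealed-demo.bin";
    let out_path = "sealed-demo.bin";
    let sector_bytes: u64 = 1024; // stand-in for the real 32 GiB sector size

    fs::write(in_path, b"piece data")?; // pretend this is the unsealed data
    fs::copy(in_path, out_path)?;       // "sealing" starts as a plain copy

    let f = OpenOptions::new().read(true).write(true).open(out_path)?;
    f.set_len(sector_bytes)?;           // zero-extend to the full sector size

    assert_eq!(fs::metadata(out_path)?.len(), sector_bytes);
    Ok(())
}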

setup_params()

The setup_params() method builds the setup parameters from the proof type configuration. The arguments passed in are: sector size, partition count, PoRep ID, and API version. For the partition count, see the partitions() method in https://github.com/filecoin-project/rust-filecoin-proofs-api/blob/23ae2893741829bddc29d7211e06c914bab5423c/src/registry.rs, with the concrete values in https://github.com/filecoin-project/rust-fil-proofs/blob/ec2ef88a17ffed991b64dc8d96b30c36b275eca0/filecoin-proofs/src/constants.rs. My analysis focuses on 32 GiB sectors, so the partition count is 10. I won't go through the other three arguments; like the partition count, they all come from those two files.
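Before looking at the setup_params source below, here is a quick worked example of what those constants amount to for a 32 GiB sector; the values (10 partitions, 11 layers, DRG degree 6, expansion degree 8) are the ones read from the two files linked above:

fn main() {
    let sector_bytes: u64 = 32 * 1024 * 1024 * 1024;

    // Each graph node is one 32-byte label, so a 32 GiB layer has 2^30 nodes.
    let nodes = sector_bytes / 32;
    assert_eq!(nodes, 1 << 30);

    // Constants for 32 GiB sectors (from the linked registry.rs / constants.rs).
    let partitions: u64 = 10; // PoRep proof partitions
    let layers: u64 = 11;     // SDR layers
    let drg_degree: u64 = 6;  // in-layer DRG parents per node
    let exp_degree: u64 = 8;  // expander parents drawn from the previous layer

    // Total label data written during P1: 11 layers x 32 GiB = 352 GiB.
    let label_gib = nodes * layers * 32 / (1024 * 1024 * 1024);
    println!(
        "partitions={}, layers={}, parents per node={}, label data={} GiB",
        partitions,
        layers,
        drg_degree + exp_degree,
        label_gib
    );
}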

pub fn setup_params(
    sector_bytes: PaddedBytesAmount,
    partitions: usize,
    porep_id: [u8; 32],
    api_version: ApiVersion,
) -> Result<stacked::SetupParams> {
    // Work out the layer count and the challenge count (layer_challenges).
    let layer_challenges = select_challenges(
        partitions,
        *POREP_MINIMUM_CHALLENGES
            .read()
            .expect("POREP_MINIMUM_CHALLENGES poisoned")
            .get(&u64::from(sector_bytes))
            .expect("unknown sector size") as usize,
        *LAYERS
            .read()
            .expect("LAYERS poisoned")
            .get(&u64::from(sector_bytes))
            .expect("unknown sector size"),
    );

    let sector_bytes = u64::from(sector_bytes);

    ensure!(
        sector_bytes % 32 == 0,
        "sector_bytes ({}) must be a multiple of 32",
        sector_bytes,
    );

    let nodes = (sector_bytes / 32) as usize; // Node count: SDR has 11 layers, and for a 32 GiB sector each layer has as many nodes as there are bytes in 1 GiB.
    let degree = DRG_DEGREE; // Base degree used for all DRG graphs; DRG_DEGREE = 6.
    let expansion_degree = EXP_DEGREE; // EXP_DEGREE = 8: how many nodes are drawn from the previous layer to compute each node in the current layer.

    Ok(stacked::SetupParams {
        nodes,
        degree,
        expansion_degree,
        porep_id,
        layer_challenges,
        api_version,
    })
}

Generating the sealed data's Merkle tree and its comm_d

Having seen setup_params, let's go back to the compound_public_params value in seal_pre_commit_phase1. The setup() call carries over the values from compound_setup_params and adds one crucial field, vanilla_params.graph, which is the constructed graph data structure.

Next we come to the block (around line 70 of seal_pre_commit_phase1) that builds the Merkle tree and comm_d:

let (config, comm_d) = measure_op(Operation::CommD, || -> Result<_> {
    let base_tree_size = get_base_tree_size::<DefaultBinaryTree>(porep_config.sector_size)?;
    let base_tree_leafs = get_base_tree_leafs::<DefaultBinaryTree>(base_tree_size)?;
    ensure!(
        compound_public_params.vanilla_params.graph.size() == base_tree_leafs,
        "graph size and leaf size don't match"
    );

    trace!(
        "seal phase 1: sector_size {}, base tree size {}, base tree leafs {}",
        u64::from(porep_config.sector_size),
        base_tree_size,
        base_tree_leafs,
    );

    let mut config = StoreConfig::new(
        cache_path.as_ref(),
        CacheKey::CommDTree.to_string(),
        default_rows_to_discard(base_tree_leafs, BINARY_ARITY),
    );

    // Build the Merkle tree; comm_d is derived from its root.
    let data_tree = create_base_merkle_tree::<BinaryMerkleTree<DefaultPieceHasher>>(
        Some(config.clone()),
        base_tree_leafs,
        &data,
    )?;
    drop(data);

    config.size = Some(data_tree.len());
    let comm_d_root: Fr = data_tree.root().into();
    let comm_d = commitment_from_fr(comm_d_root);

    drop(data_tree);

    Ok((config, comm_d))
})?;

Here we first create the tree's StoreConfig, then use the config, base_tree_leafs and the padded sealed data to build a Merkle tree. With the tree built we can take its root, and from the root commitment_from_fr computes comm_d.
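To make "tree root → comm_d" a bit more concrete, here is a toy binary Merkle root over 32-byte leaves, written as a standalone sketch using the sha2 crate (the real DefaultPieceHasher is a sha256 variant whose outputs are truncated to fit the BLS12-381 field, and the real tree is built over the Fr32-padded sector data; this sketch skips both details):

use sha2::{Digest, Sha256};

// Fold a power-of-two number of 32-byte leaves up to a single root.
fn merkle_root(leaves: &[[u8; 32]]) -> [u8; 32] {
    assert!(!leaves.is_empty() && leaves.len().is_power_of_two());
    let mut layer: Vec<[u8; 32]> = leaves.to_vec();
    while layer.len() > 1 {
        layer = layer
            .chunks(2)
            .map(|pair| {
                let digest = Sha256::new()
                    .chain_update(pair[0])
                    .chain_update(pair[1])
                    .finalize();
                let mut node = [0u8; 32];
                node.copy_from_slice(&digest);
                node
            })
            .collect();
    }
    layer[0]
}

fn main() {
    // Four dummy leaves standing in for the sector's data nodes.
    let leaves = [[0u8; 32], [1u8; 32], [2u8; 32], [3u8; 32]];
    println!("toy comm_d-like root: {:02x?}", merkle_root(&leaves));
}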

Generating the replica ID (replica_id)

Once we have comm_d, verify_pieces is used to sanity-check it against the piece infos. I won't cover that here; if you're interested, go read the code yourself.

Let's see how the replica ID is generated.

let replica_id = generate_replica_id::<Tree::Hasher, _>(
    &prover_id,
    sector_id.into(),
    &ticket,
    comm_d,
    &porep_config.porep_id,
);

comm_d was derived from the data itself; adding the miner (prover) ID, sector ID, ticket and PoRep ID on top of it yields the replica ID.

/// Generate the replica id as expected for Stacked DRG.
pub fn generate_replica_id<H: Hasher, T: AsRef<[u8]>>(
    prover_id: &[u8; 32],
    sector_id: u64,
    ticket: &[u8; 32],
    comm_d: T,
    porep_seed: &[u8; 32],
) -> H::Domain {
    // Feed the inputs into the hasher in a chain.
    let hash = Sha256::new()
        .chain_update(prover_id)
        .chain_update(&sector_id.to_be_bytes())
        .chain_update(ticket)
        .chain_update(&comm_d)
        .chain_update(porep_seed)
        .finalize();

    // Convert the 32-byte slice (little-endian, non-Montgomery form) into an
    // Fr::Repr by zeroing the two most significant bits of le_bytes.
    bytes_into_fr_repr_safe(hash.as_ref()).into()
}

Generating the labels

Now comes the final P1 step: generating the labels. public_params, the replica ID and the tree StoreConfig are passed in as arguments.

pub fn replicate_phase1(
    pp: &'a PublicParams<Tree>,
    replica_id: &<Tree::Hasher as Hasher>::Domain,
    config: StoreConfig,
) -> Result<Labels<Tree>> {
    info!("replicate_phase1");

    let labels = measure_op(Operation::EncodeWindowTimeAll, || {
        Self::generate_labels_for_encoding(&pp.graph, &pp.layer_challenges, replica_id, config)
    })?
    .0;

    Ok(labels)
}

Here you can see the code pull two things out of public_params: the .graph field, which is the constructed graph data structure, and layer_challenges, which holds the number of layers and the challenge count.

Next, look at generate_labels_for_encoding. Here the SDR encoding that creates the labels can run either multi-core or single-core.

pub fn generate_labels_for_encoding(
    graph: &StackedBucketGraph<Tree::Hasher>,
    layer_challenges: &LayerChallenges,
    replica_id: &<Tree::Hasher as Hasher>::Domain,
    config: StoreConfig,
) -> Result<(Labels<Tree>, Vec<LayerState>)> {
    let mut parent_cache = graph.parent_cache()?;

    #[cfg(feature = "multicore-sdr")]
    {
        if SETTINGS.use_multicore_sdr {
            info!("multi core replication");
            create_label::multi::create_labels_for_encoding(
                graph,
                &parent_cache,
                layer_challenges.layers(),
                replica_id,
                config,
            )
        } else {
            info!("single core replication");
            create_label::single::create_labels_for_encoding(
                graph,
                &mut parent_cache,
                layer_challenges.layers(),
                replica_id,
                config,
            )
        }
    }

    #[cfg(not(feature = "multicore-sdr"))]
    {
        info!("single core replication");
        create_label::single::create_labels_for_encoding(
            graph,
            &mut parent_cache,
            layer_challenges.layers(),
            replica_id,
            config,
        )
    }
}

I'll leave the label-generation code itself at the links below for anyone who wants to dig in; I won't go through it in detail here.

Multi-core: https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-porep/src/stacked/vanilla/create_label/multi.rs

Single-core: https://github.com/filecoin-project/rust-fil-proofs/blob/master/storage-proofs-porep/src/stacked/vanilla/create_label/single.rs
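I won't go through those two files, but to give a feel for what they compute, here is a heavily simplified, self-contained sketch of SDR labelling using the sha2 crate (my own illustration with made-up parent selection, not the real code): each node's label hashes the replica_id, the layer and node indices, and the labels of its parents; the "DRG" parents live in the current layer and the "expander" parents in the previous one, which is what forces the layers, and the nodes within a layer, to be computed sequentially.

use sha2::{Digest, Sha256};

type Label = [u8; 32];

// Made-up parent selection. The real parents come from the stacked bucket
// graph: 6 DRG parents in the same layer and 8 expander parents in the
// previous layer. Here we just pick deterministic earlier indices.
fn toy_parents(node: u64, count: u64) -> Vec<u64> {
    if node == 0 {
        return Vec::new();
    }
    (1..=count).map(|i| (i * 2_654_435_761) % node).collect()
}

fn label_layer(replica_id: &Label, layer: u32, prev: Option<&[Label]>, nodes: u64) -> Vec<Label> {
    let mut labels: Vec<Label> = Vec::with_capacity(nodes as usize);
    for node in 0..nodes {
        let mut hasher = Sha256::new()
            .chain_update(replica_id)
            .chain_update(layer.to_be_bytes())
            .chain_update(node.to_be_bytes());
        // "DRG" parents: labels already computed in this layer.
        for p in toy_parents(node, 6) {
            hasher.update(labels[p as usize]);
        }
        // "Expander" parents: labels from the previous layer, if any.
        if let Some(prev) = prev {
            for p in toy_parents(node, 8) {
                hasher.update(prev[p as usize]);
            }
        }
        let mut label = [0u8; 32];
        label.copy_from_slice(&hasher.finalize());
        labels.push(label);
    }
    labels
}

fn main() {
    let replica_id: Label = [7u8; 32];
    let nodes = 16u64; // the real thing has 2^30 nodes per layer for 32 GiB
    let mut layers: Vec<Vec<Label>> = Vec::new();
    for layer in 1..=11u32 {
        let prev = layers.last().map(|l| l.as_slice());
        layers.push(label_layer(&replica_id, layer, prev, nodes));
    }
    println!("computed {} layers of {} labels each", layers.len(), nodes);
}

Roughly speaking, the real code reads the parent indices from a precomputed parent cache on disk, and the multi-core path pipelines the hashing across cores while still respecting these dependencies.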

3. Summary

To be honest, I haven't worked with Rust much; reading this code gave me a headache at first, and in the end I just gritted my teeth and chewed through it.

If any experts find mistakes in this article, corrections are welcome.

References:

1. https://spec.filecoin.io/#section-algorithms.pos.porep

Author: 三峡星未来数据 (https://www.bilibili.com/read/cv18486728?spm_id_from=333.999.0.0) Source: bilibili
