Mainnet bootstrap strategy, how to get the power table? #596

Open
5 tasks
Kubuxu opened this issue Aug 27, 2024 · 12 comments

@Kubuxu
Contributor

Kubuxu commented Aug 27, 2024

Option 1: Save the power table in a new field in the PowerTable actor during migration
Option 2: Bootstrap from chain lookback, the oh-shit-store, or initial power table CID snapshots; in the first update after the upgrade, Lotus includes the initial power table CID in the binary.

Tasks

@Stebalien
Member

So... I still want to do option 2, but option 1 is nice because it requires no coordination and it doesn't preclude option 2. It does require a small FIP update, but I don't expect it'll be that controversial.

The issue with option 2 is that the CAR "roots" are currently expected to be a tipset. Ideally, we'd have a single root metadata object pointing to the chain and whatever else we want, but... that's not what we have right now.

@Stebalien
Member

So, I'd say go with option 1 and punt option 2 into the future.

@Stebalien Stebalien self-assigned this Sep 3, 2024
@Stebalien
Member

Stebalien commented Sep 3, 2024

Proposal:

  1. Add a new Option<Cid> field to the power actor that stores the power table, post bootstrap.
  2. When the network version bumps to v25, record the power table in the cron tick. We're using a network version so we can disable F3 by disabling a migration.
  3. Expose this power table via some F3InitialPowerTable -> Option<Cid> function.
  4. Change the bootstrap logic: at every epoch, look back by finality (900 epochs) and call F3InitialPowerTable. If it returns a power table, bootstrap F3 with that power table.
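
A minimal sketch of the per-epoch check in step 4, in Go. Everything here is an assumption for illustration: `PowerTableLookup` stands in for the proposed F3InitialPowerTable function, epochs are plain `int64` values, and CIDs are plain strings. It is not the actual Lotus or go-f3 API.

```go
package main

// Finality is the lookback distance in epochs (900, per the proposal).
const Finality = 900

// PowerTableLookup is a hypothetical stand-in for F3InitialPowerTable:
// given an epoch, it returns the power table CID stored in the power actor
// state at that epoch, or ok=false while the field is still unset.
type PowerTableLookup func(epoch int64) (cid string, ok bool)

// CheckBootstrap is meant to be called once per epoch. It looks back by
// Finality from the current head and reports the initial power table CID
// only if it is already recorded in state that old.
func CheckBootstrap(head int64, lookup PowerTableLookup) (string, bool) {
	lookback := head - Finality
	if lookback < 0 {
		// The chain is shorter than the lookback window.
		return "", false
	}
	return lookup(lookback)
}
```

With this shape, a node would start F3 exactly 900 epochs after the cron tick that recorded the power table, since that is the first head whose lookback state contains the field.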

@Stebalien
Member

Note: the alternative is to do this in the migration itself. However, I'd like to:

  1. Do some mainnet testing with the final-final version before launching F3.
  2. Avoid 2 state migrations.

@Stebalien
Member

Ah, so, we need all the worker keys. This is best done through a migration of some form, unfortunately.

@Stebalien
Member

Ok, discussed with @jennijuju: we can do two migrations but avoid migrating the actor code in the second migration. Instead, the second migration will just create the power table and attach it to the power actor.

@rjan90
Contributor

rjan90 commented Sep 11, 2024

Some open questions for option 2: how do we write the migration? Is there a need to create an nv-skeleton in Lotus/GST/Filecoin-FFI? Will it be similar to the Lightning/Thunder upgrade?

We should also give Forest an early heads-up on our strategy here, so that they can prep for this migration.

@BigLep
Member

BigLep commented Sep 11, 2024

Additional 2024-09-11 conversation:

  • @Stebalien is going to write up options
  • This is going to involve a FIP update to document the change.

I added these tasks to the issue description:

  • Enumerate the options (with pros/cons or in a decision table)
  • Get decision made with stakeholders
  • Update FIP with decision
  • Lotus implementation work
  • Forest implementation work

Please update/correct where wrong or outdated.

@BigLep BigLep added the P1 label Sep 11, 2024
@Stebalien
Member

We discussed the migration option in standup. Unfortunately, Forest would have to implement the migration as well, and the migration will likely be non-standard because, to keep it small, we don't want to bump the actors version. We can still do that, but we need to discuss it with them.

We also discussed some alternatives:

  1. When a user syncs from a snapshot taken after the F3 bootstrap epoch (that is, a snapshot that doesn't include the power table from the bootstrap epoch), either (a) require that they use a new client version that hard-codes the bootstrap power table CID or (b) require that they pass said power table CID on the command line when importing a snapshot. The downside of this approach is that it requires user intervention and could cause issues for automated deployment setups.
  2. We could just add the CID to snapshots (e.g., stuff it in the CAR header). From what I can tell, this isn't too terrible, but it's kind of an abuse of the CAR format. This will require cooperation from ChainSafe (they produce the snapshots), but the effort should be minimal.
  3. We could preserve the F3 bootstrap state-tree (both in the datastore and in the snapshot). This will require some work, but not too much work. However, this will keep extra state around, which will grow the datastore a bit.

@Stebalien
Member

I've discussed this with the F3 team and @jennijuju and it sounds like option 1 isn't so bad after all.

We'd have two releases:

  1. Release A: Before the network upgrade.
  2. Release B: Immediately after the network upgrade.

Release A will have (a) an environment variable to specify the F3 bootstrap power table CID, (b) the ability to specify it when importing a snapshot, and (c) the ability to import snapshots without specifying the variable (?) (we'll have to assess the risk of this, as the peer won't be able to participate in F3).

Release B will be identical to release A except the bootstrap power table CID will be set.

We'll need to coordinate with Forest/Venus to make sure this works for them.
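
As a rough sketch of how a client might pick the bootstrap power table CID across the two releases, in Go. The precedence order (import-time value, then environment variable, then value built into the binary) is an assumption, as are the function and error names; none of this is taken from Lotus.

```go
package main

import "errors"

// ErrNoPowerTable means no bootstrap power table CID is known. A node in
// this state could still import a snapshot but could not participate in F3.
var ErrNoPowerTable = errors.New("no F3 bootstrap power table CID configured")

// ResolveBootstrapCID returns the first non-empty source, in this assumed
// order: the value passed at snapshot import, the environment variable,
// then the CID built into the binary (empty in release A, set in release B).
func ResolveBootstrapCID(importFlag, envValue, builtIn string) (string, error) {
	for _, candidate := range []string{importFlag, envValue, builtIn} {
		if candidate != "" {
			return candidate, nil
		}
	}
	return "", ErrNoPowerTable
}
```

Under this ordering, release B behaves like release A whenever the operator supplies a value explicitly, and falls back to the built-in CID otherwise.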

@Stebalien
Member

While writing this up, I did have another thought: technically, we can start late, and our certificate store even supports this. To bootstrap, we:

  1. Fetch the earliest finality certificate signed by a power table we have.
  2. Validate that finality certificate.
  3. Start from there.
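
The selection in step 1 could look roughly like the following Go sketch. The `Certificate` type and its fields are hypothetical simplifications, not the go-f3 types, and signature validation (step 2) is deliberately left out.

```go
package main

// Certificate is a simplified, hypothetical view of a finality certificate.
type Certificate struct {
	Instance      uint64 // F3 instance the certificate finalizes
	PowerTableCID string // power table whose participants signed it
}

// EarliestUsable returns the lowest-instance certificate that was signed by
// a power table present in known. The caller is expected to validate the
// returned certificate before starting from it.
func EarliestUsable(certs []Certificate, known map[string]bool) (Certificate, bool) {
	var best Certificate
	found := false
	for _, c := range certs {
		if !known[c.PowerTableCID] {
			continue // we cannot verify signatures from this power table
		}
		if !found || c.Instance < best.Instance {
			best, found = c, true
		}
	}
	return best, found
}
```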

@ruseinov

  • Fetch the earliest finality certificate signed by a power table we have.
  • Validate that finality certificate.
  • Start from there.

Correct me if I've misunderstood: we're looking for the earliest cert signed by the current PT, and then we just verify all the subsequent certificates until the bootstrap is finished.

Status: In progress