Full Virtual TPM for VMs and Containers #4071

Merged
merged 9 commits into from
Sep 3, 2024

@shjala shjala commented Jul 9, 2024

Problem with existing VTPM service

First of all, VTPM has a misleading name: it is not actually a virtual TPM but an HTTP server that, through its APIs, exposes a limited set of the host's TPM functionality to callers. As a result, there is no TPM device present in the VM (no /dev/tpm*). This is problematic because some services (for example, Azure Identity Service) expect a TPM device to be available and are programmed to communicate directly with the TPM for credential transfer and signing.

To work around this, we patch the Azure Identity Service to translate its TPM calls into VTPM calls (the patch set is maintained in the EVE-TOOLS repository). This approach works, but it can lead to maintenance issues. Currently we support AZIOT version 1.2. We claim to support version 1.4 too, but in reality we revert the Azure Identity Service to a commit between versions 1.2 and 1.3 to apply the patches, and we are lucky that the compiled binaries are still compatible with version 1.4.

Full Virtual TPM in VMs and Containers

As mentioned above, using VTPM is not optimal. VTPM comes with its own set of problems [1]. In addition, it creates a limiting environment: any external program that relies on a TPM requires creating and maintaining its own set of patches. To solve this problem, this PR uses Software TPM (SWTPM) to emulate a virtual TPM per VM.

Fortunately, SWTPM integrates well with QEMU. To emulate a virtual TPM in a VM, we create one SWTPM instance per VM. The SWTPM instance is configured to use a Unix domain socket as its communication channel. After the socket path is passed to the QEMU virtual TPM configuration, QEMU automatically creates a virtual TPM device for the VM, which is accessible like a normal TPM under /dev/tpm*. We can create the same setup for bare-metal containers by using the Linux Virtual TPM Proxy driver to create arbitrary virtual TPM devices on the host, each backed by a SWTPM process and mapped into the container as a /dev/tpmX device.
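
As a rough illustration of the wiring described above, the following sketch prints the kind of command lines involved. The per-VM file layout under /persist/swtpm and /run/swtpm, the socket file name, and the IDs chrtpm and tpm0 are assumptions made for illustration and may differ from what this PR actually generates; the flags themselves are standard swtpm and QEMU options.

```shell
#!/bin/sh
# Sketch only: print the swtpm and QEMU command-line fragments for one VM.
# The <dir>/<vm-id> naming scheme below is hypothetical.

# swtpm_cmd VM_ID
# TPM 2.0 emulator with on-disk state and a Unix domain socket control channel.
swtpm_cmd() {
    vm_id=$1
    printf '%s\n' "swtpm socket --tpm2 --tpmstate dir=/persist/swtpm/${vm_id} --ctrl type=unixio,path=/run/swtpm/${vm_id}.ctrl.sock"
}

# qemu_tpm_args VM_ID
# QEMU options that attach the emulator socket to the guest as a TPM TIS device.
qemu_tpm_args() {
    vm_id=$1
    printf '%s\n' "-chardev socket,id=chrtpm,path=/run/swtpm/${vm_id}.ctrl.sock -tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0"
}

# Example: swtpm_cmd vm01; qemu_tpm_args vm01
```

Printing the fragments rather than executing them keeps the sketch free of side effects; in a real setup the swtpm process has to be running and its socket present before QEMU starts.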

SWTPM saves the TPM state to disk and loads it again, so at the next VM boot all the TPM keys, TPM NVRAM data, etc. are present. In addition, SWTPM is configured to encrypt the TPM state files using a 256-bit AES key. We use the same key as the vault key; this key is stored in the TPM, and access to it is authorized using a PCR policy. Reusing it makes management easier, because a separate key for the SWTPM state would be one more key to back up to and restore from the controller. If needed, we can pass the constant VM ID and the encryption key through a KDF to create a unique key for each VM or container, but that brings no added security. Here is a diagram showing the basic architecture of the new VTPM:

[Figure: basic architecture of the new VTPM]
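
The optional per-VM key derivation mentioned above could look roughly like the sketch below. It is not what this PR implements (the PR reuses the vault key directly); it uses a single HMAC-SHA256 call as a simple stand-in for a KDF, and the function name derive_vm_key is hypothetical.

```shell
#!/bin/sh
# Sketch only: derive a per-VM 256-bit key from a master key and a VM ID.
# HMAC-SHA256 is used here as a simple PRF-style stand-in for a full KDF.

# derive_vm_key KEY_HEX VM_ID
# Prints HMAC-SHA256(key, vm_id) as 64 lowercase hex characters.
# Note: passing a key on the command line exposes it in the process list;
# this is acceptable for a demo only.
derive_vm_key() {
    key_hex=$1
    vm_id=$2
    printf '%s' "$vm_id" |
        openssl dgst -sha256 -mac HMAC -macopt "hexkey:${key_hex}" -binary |
        od -An -v -tx1 | tr -d ' \n'
}

# Example (dummy key, hypothetical VM ID):
#   derive_vm_key 000102030405060708090a0b0c0d0e0f vm01
```

Because the VM ID is constant and not secret, anyone holding the master key can recompute every derived key, which is consistent with the remark above that this brings no added security by itself.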

1: For example, it relies on a set of TPM management utilities (TPM-TOOLS) to communicate with the host TPM. This dependency not only increases the attack surface but can also cause maintenance problems, because TPM-TOOLS consists of command-line utilities (not APIs), and a new version of TPM-TOOLS can change the behavior of its command-line arguments and break VTPM functionality without notice.

TODO

  • Update the documents
  • rename old vtpm to ptpm (Proxy TPM) and move it to a new package
  • fix apparmor profiles for the new vtpm/swtpm
  • fix apparmor profiles for ptpm
  • add eden test
  • sign swtpm EK with actual TPM ek to create a chain of trust
  • how to prevent attacker from cloning a swtpm instance (encrypted state)? what is our attack model?
  • make sure swtpm state pass the robustness tests (sudden shutdown and information loss) -- see Full Virtual TPM for VMs and Containers #4071 (comment)
  • Update alpine in another PR instead of using shahzededa/eve-alpine (Update alpine mirrors for 3.16 to build swtpm #4091)
  • Use tpm proxy kernel driver to support virtual TPMs in bare-metal containers (separate PR)
  • move old vtpm to ptpm (separate PR)
  • make sure each swtpm instance gets its own unique and persistent seed (and by extension SRK, EK, etc) -- see Full Virtual TPM for VMs and Containers #4071 (comment)
  • update lfedge/eve-alpine everywhere

Eden Test

lf-edge/eden#1011

Manual Tests

So far I have tested the following with Ubuntu 22.04.2 LTS:
1- Sign and verify with the TPM using the endorsement hierarchy [successful]

ubuntu@vm01:~$ tpm2_createprimary -C e -c primary.ctx
tpm2_create -G rsa -u rsa.pub -r rsa.priv -C primary.ctx
tpm2_load -C primary.ctx -u rsa.pub -r rsa.priv -c rsa.ctx
echo "my message" > message.dat
tpm2_sign -c rsa.ctx -g sha256 -o sig.rssa message.dat
tpm2_verifysignature -c rsa.ctx -g sha256 -s sig.rssa -m message.dat
name-alg:
  value: sha256
  raw: 0xb
attributes:
  value: fixedtpm|fixedparent|sensitivedataorigin|userwithauth|restricted|decrypt
  raw: 0x30072
type:
  value: rsa
  raw: 0x1
exponent: 65537
bits: 2048
scheme:
  value: null
  raw: 0x10
scheme-halg:
  value: (null)
  raw: 0x0
sym-alg:
  value: aes
  raw: 0x6
sym-mode:
  value: cfb
  raw: 0x43
sym-keybits: 128
rsa: 8e623d2f8e575d615aa243414e5e36d388f9fd5481a8fe350cb17843c7b209df4b4518aaa331bcc6ec13996913d676c88fd0cee45131b287622a3610268ea472d73ba6ea99f28d0292609715401a1aaaf735f138f37b86cb0af30b7551c7b25870ad2569fd7cfd5d0e307cd4df2ea14fc8fe9af0694cb368d17510055745bf06b2fed45175d46656cea4b74e6d53dcb6a500986d5d02f5ef52f7b72edd24be450c9449699fa25e395976040834dae0f752f0f80520b4ab49d80ff39ceaa98dd77f768307b0cecfd1abe3a71fc09c0009805bfbe06a8a554077c25e116cc89e8e243e67e1a6eed4376d5abffa7d738c80c54c0d84e6bbdaedea9f4da6aaf59e43
name-alg:
  value: sha256
  raw: 0xb
attributes:
  value: fixedtpm|fixedparent|sensitivedataorigin|userwithauth|decrypt|sign
  raw: 0x60072
type:
  value: rsa
  raw: 0x1
exponent: 65537
bits: 2048
scheme:
  value: null
  raw: 0x10
scheme-halg:
  value: (null)
  raw: 0x0
sym-alg:
  value: null
  raw: 0x10
sym-mode:
  value: (null)
  raw: 0x0
sym-keybits: 0
rsa: ce920e66075bc14aed1f0956296dc78433681e055b616d5e09a0a9b5a68ab6ea81b959f7e636ae9e6236786d14f84632cf5c1d8c74fb1e339ba782960d50d63ffc4f0e709ee3a68614a5a63b47ffbc37408f67da758e3009b8cc25cc2bd03c814db2e9a9ca9ba7192be9c579a0ce13a383f50f7540567fa21f1f5ce70ce76ea1e70ab64e233acc1cb2b497cffa0a047f167b8fe3795704178f141d2028c2d585bf34aec9f4a23cbc661ec81fabf6ca92b4fb6af3818d2e4bdfd2edf165039604e366226d3a37fc0ffcfd60be08bae2abdfb8711879fe7711b5db6ee98af040d7ac295668c41f69075761b418ccd6d64539d2da231497f854893c96c72868133b
name: 000bfe394bb93fc82e2e6fae6f6c4dab6199215674727f5c287aa3641eff4d8f3779
ubuntu@vm01:~$ echo $?
0

2- Sign with the TPM and verify with openssl [successful]

ubuntu@vm01:~$ openssl ecparam -name prime256v1 -genkey -noout -out private.ecc.pem
openssl ec -in private.ecc.pem -out public.ecc.pem -pubout
# Generate a hash to sign
echo "data to sign" > data.in.raw
sha256sum data.in.raw | awk '{ print "000000 " $1 }' | \
xxd -r -c 32 > data.in.digest
# Load the private key for signing
tpm2_loadexternal -Q -G ecc -r private.ecc.pem -c key.ctx
# Sign in the TPM and verify with OSSL
tpm2_sign -Q -c key.ctx -g sha256 -d -f plain -o data.out.signed data.in.digest
openssl dgst -verify public.ecc.pem -keyform pem -sha256 \
-signature data.out.signed data.in.raw
read EC key
writing EC key
Verified OK

Manually Test AzureIoTEdge (Ubuntu 20.04, Aziot 1.4.0 and EVE-TOOLS)

Manually deploying ubuntu-20.04-server-amd64 and running the test script I wrote for eden:

ubuntu@vm01:~$ sudo iotedge system status
System services:
    aziot-edged             Running
    aziot-identityd         Running
    aziot-keyd              Running
    aziot-certd             Running
    aziot-tpmd              Running

Manually Test AzureIoTEdge (Ubuntu 20.04, Aziot latest (1.5.x) without EVE-TOOLS, using Virtual TPM)

Manually deploying a ubuntu-22.04-server-amd64 and running the test_make_tpm_keys and then test_ubuntu22.04_aziot_latest :

[00:31:28.097] ubuntu@vm01:~$ sudo iotedge system status
System services:
[00:31:35.783]     aziot-edged             Running
[00:31:35.784]     aziot-identityd         Running
[00:31:35.785]     aziot-keyd              Running
[00:31:35.785]     aziot-certd             Running
[00:31:35.786]     aziot-tpmd              Running
[00:31:35.787]

@shjala shjala requested a review from eriknordmark as a code owner July 9, 2024 10:44
@shjala shjala marked this pull request as draft July 9, 2024 10:44
@shjala shjala changed the title [WIP] virtual TPM for vm and containers [WIP] Virtual TPM for VMs and Containers Jul 9, 2024
@shjala shjala changed the title [WIP] Virtual TPM for VMs and Containers [PoC] Virtual TPM for VMs and Containers Jul 11, 2024
@shjala shjala changed the title [PoC] Virtual TPM for VMs and Containers Full Virtual TPM for VMs and Containers Jul 15, 2024
@shjala shjala marked this pull request as ready for review July 16, 2024 13:58

shjala commented Jul 16, 2024

Please don't mind the many commits; I will squash them later. This is more or less ready for review even though there are still tasks on the list (I'll do them while the review is going on).

@shjala shjala force-pushed the swtpm_in_vm branch 3 times, most recently from d17afe0 to e34bc2d Compare July 17, 2024 08:53
@eriknordmark
Contributor

Isn't /dev/tpm0 the TPM 1.X device, and /dev/tpmrm0 the TPM 2.0 device?

I assume we need to keep the current vtpm linuxkit container around since there are some AZIOT VMs which use the existing eve-tools to access that.

@eriknordmark
Contributor

sign swtpm EK with actual TPM ek to create a chain of trust

Would it make sense to also add some way for an app instance to find out about the host's TPM? Or will that be implicit in the signature it will see for its EK?
(If need be we can export information to app instances using the meta-data server.)

how to prevent attacker from cloning a swtpm instance (encrypted state)? what is our attack model?

It might make sense to have the encryption key be a function of the App instance UUID. That would prevent someone from reusing it in another VM they control.

Note that I haven't thought through the attack scenario in detail, other than that a user with physical access can temporarily remove the disk, mount it on a different computer, and copy /persist/foo/A to /persist/foo/B. But if we put this under /persist/vault/ then that possibility goes away.

@eriknordmark eriknordmark left a comment

It is quite hard to review this PR due to the choice to rename the existing vtpm to ptpm and to call the new one vtpm.

Can we defer that rename until the end of the process so that the diffs make some more sense?

shjala commented Jul 24, 2024

Isn't /dev/tpm0 the TPM 1.X device, and /dev/tpmrm0 the TPM 2.0 device?

/dev/tpmrm0 is the same as /dev/tpm0 but adds automatic resource management, basically a tpm2-abrmd in the kernel. It helps manage TPM resources; if the raw /dev/tpm0 is used as the TPM interface instead, it is common to exhaust TPM resources fairly quickly and hit errors.

AFAIK the TPM version is determined by sending a GetCap packet and examining the returned version. Here we only support TPM 2.0, via the --tpm2 argument passed to swtpm.
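
To make the distinction between the two device nodes concrete, here is a small sketch of how a guest-side script might prefer the resource-managed node when it exists. The function name pick_tpm_dev and its optional directory argument are hypothetical and exist only so the logic can be exercised without a real TPM.

```shell
#!/bin/sh
# Sketch only: prefer the in-kernel resource manager (/dev/tpmrm0) and fall
# back to the raw device (/dev/tpm0) when the former is not available.

# pick_tpm_dev [DEV_DIR]
# Prints the chosen device path; returns 1 if neither node exists.
pick_tpm_dev() {
    dev_dir=${1:-/dev}
    if [ -e "${dev_dir}/tpmrm0" ]; then
        printf '%s\n' "${dev_dir}/tpmrm0"
    elif [ -e "${dev_dir}/tpm0" ]; then
        printf '%s\n' "${dev_dir}/tpm0"
    else
        return 1
    fi
}

# Example use with tpm2-tools, which reads its TCTI from this variable:
#   dev=$(pick_tpm_dev) && export TPM2TOOLS_TCTI="device:${dev}"
```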

I assume we need to keep the current vtpm linuxkit container around since there are some AZIOT VMs which use the existing eve-tools to access that.

Yes, I will keep it running in the VTPM container alongside SWTPM until everyone has upgraded to newer versions of AZIOT with no eve-tools patches.

shjala commented Jul 25, 2024

It is quite hard to review this PR due to the choice to rename the existing vtpm to ptpm and to call the new one vtpm.

Can we defer that rename until the end of the process so that the diffs make some more sense?

should be less cluttered now.

shjala commented Sep 2, 2024

  • Rebased against master and resolved the merge conflict.
  • Running Yetus locally, it complains about an auto-generated patch file and about invalid flag: --keep-git-dir. There is an issue open for the latter in the Hadolint repo: unexpected '-' when using --keep-git-dir hadolint/hadolint#951
  • The commit message linter complains about one-line commit messages (which are enough) and about the body not starting with a capital letter (which I think is too restrictive, and it is a waste of runners' time and energy to push again just to capitalize a letter).

@OhmSpectator
Member

Is there a way to temporarily suppress the Yetus warning?

Regarding the capital letter. Yep, in some cases, it can be unnecessary. Still, in others, when you write a sentence, I find it useful as it can signal that the commit message was not carefully written in a "get off me" style.

shjala commented Sep 2, 2024

Is there a way to temporarily suppress the Yetus warning?

No. I think it is not a rule but the parser that has no idea about --keep-git-dir; my Haskell is not good enough to send a patch.

@OhmSpectator
Member

I've downloaded the Yetus results from here to check if there are other issues as well, but the reports are empty... Strange.

shjala commented Sep 2, 2024

I've downloaded the Yetus results from here to check if there are other issues as well, but the reports are empty... Strange.

Yes I tried that too, here is my local run :

$ make mini-yetus MYETUS_VERBOSE=y
Running mini-yetus
./tools/mini-yetus.sh -f  
[+] Running mini-yetus
[+] No source branch specified. Using the main branch: master
[+] No destination branch specified. Using the current branch: swtpm_in_vm
[+] Running yetus on the changes...
[+] Full results:
.codespellignorelines:10:0:blanks:tabs in line
.codespellignorelines:1:0:blanks:tabs in line
.codespellignorelines:11:0:blanks:tabs in line
.codespellignorelines:12:0:blanks:tabs in line
.codespellignorelines:2:0:blanks:tabs in line
.codespellignorelines:3:0:blanks:tabs in line
.codespellignorelines:4:0:blanks:tabs in line
.codespellignorelines:5:0:blanks:tabs in line
.codespellignorelines:6:0:blanks:tabs in line
.codespellignorelines:7:0:blanks:tabs in line
.codespellignorelines:8:0:blanks:tabs in line
.codespellignorelines:9:0:blanks:tabs in line
.hadolint.yaml:1:1:yamllint: warning: missing document start "---" (document-start)
/home/shah/shah-dev/eve/pkg/pillar/hypervisor/kvm.go:198:0:detsecrets:db3d405b10675998c030223177d42e71b4e7a312:Secret Keyword
/home/shah/shah-dev/eve/pkg/pillar/hypervisor/kvm.go:4:1:revive: should have a package comment https://revive.run/r#package-comments
/home/shah/shah-dev/eve/pkg/pillar/hypervisor/kvm_test.go:2371:0:detsecrets:db3d405b10675998c030223177d42e71b4e7a312:Secret Keyword
/home/shah/shah-dev/eve/pkg/pillar/hypervisor/kvm_test.go:56:0:detsecrets:e7ea4f94cb4af75c6643566ca6d95d9433b8a6f2:Secret Keyword
/home/shah/shah-dev/eve/pkg/pillar/scripts/onboot.sh:121:0:shelldocs:ERROR: function percent_used has no @audience
/home/shah/shah-dev/eve/pkg/pillar/scripts/onboot.sh:121:0:shelldocs:ERROR: function percent_used has no @stability
/home/shah/shah-dev/eve/pkg/pillar/scripts/onboot.sh:139:0:shelldocs:ERROR: function free_space has no @audience
/home/shah/shah-dev/eve/pkg/pillar/scripts/onboot.sh:139:0:shelldocs:ERROR: function free_space has no @stability
/home/shah/shah-dev/eve/pkg/vtpm/Dockerfile:33:25:hadolint: invalid flag: --keep-git-dir
/home/shah/shah-dev/eve/pkg/vtpm/patch/patch-tpm2-tools.diff:11:0:blanks:end of line
/home/shah/shah-dev/eve/pkg/vtpm/patch/patch-tpm2-tools.diff:15:0:blanks:end of line
/home/shah/shah-dev/eve/pkg/vtpm/patch/patch-tpm2-tools.diff:17:0:blanks:end of line
/home/shah/shah-dev/eve/pkg/vtpm/patch/patch-tpm2-tools.diff:24:0:blanks:end of line
/home/shah/shah-dev/eve/pkg/vtpm/patch/patch-tpm2-tools.diff:7:0:blanks:end of line
.syft.yaml:7:1:yamllint: warning: missing document start "---" (document-start)
[+] Results stored in /tmp/tmp.GXzKfYfdpI/yetus-output

@OhmSpectator
Member

Ok, in general, it looks like it is ready to be merged. Does anyone who reviewed the logic have any more comments, or are all satisfied? @milan-zededa, @christoph-zededa, @eriknordmark ?

@eriknordmark
Contributor

I see there are still two TODOs which are not checked. The alpine tag can be done separately, but I wonder about "make sure each swtpm instance gets its own unique and persistent seed (and by extension SRK, EK, etc)". Is this done?

shjala commented Sep 2, 2024

I see there are still two TODOs which are not checked. The alpine tag can be done separately, but I wonder about "make sure each swtpm instance gets its own unique and persistent seed (and by extension SRK, EK, etc)". Is this done?

Yes, it is:

shah@shah:~/shah-dev/eve/swtpm-seed-test$ cat test-swtpm-seed.sh

#!/bin/bash

CWD=$(pwd)
TPM_SRV_PORT1=2000
TPM_CTR_PORT1=2001
TPM_SRV_PORT2=3000
TPM_CTR_PORT2=3001

EK_HANDLE=0x81000001

TPM_STATE1=/tmp/swtpm-seed-test1
TPM_STATE2=/tmp/swtpm-seed-test2

rm -rf $TPM_STATE1
rm -rf $TPM_STATE2
mkdir -p $TPM_STATE1
mkdir -p $TPM_STATE2

flushtpm() {
  tpm2 flushcontext -t
  tpm2 flushcontext -l
  tpm2 flushcontext -s
}

swtpm socket --tpm2 \
    --server port="$TPM_SRV_PORT1" \
    --ctrl type=tcp,port="$TPM_CTR_PORT1" \
    --tpmstate dir="$TPM_STATE1" \
    --flags not-need-init,startup-clear &
PID1=$!

swtpm socket --tpm2 \
    --server port="$TPM_SRV_PORT2" \
    --ctrl type=tcp,port="$TPM_CTR_PORT2" \
    --tpmstate dir="$TPM_STATE2" \
    --flags not-need-init,startup-clear &
PID2=$!

# create first EK and export it
export TPM2TOOLS_TCTI="swtpm:host=localhost,port=$TPM_SRV_PORT1"
tpm2 clear
tpm2 createek -c ek1.ctx
flushtpm
tpm2 readpublic -Q -c ek1.ctx -f pem -o ek1.pem

# create second EK and export it
export TPM2TOOLS_TCTI="swtpm:host=localhost,port=$TPM_SRV_PORT2"
tpm2 clear
tpm2 createek -c ek2.ctx
flushtpm
tpm2 readpublic -Q -c ek2.ctx -f pem -o ek2.pem

kill $PID1
kill $PID2
rm ek1.ctx ek2.ctx
rm -rf $TPM_STATE1 $TPM_STATE2
shah@shah:~/shah-dev/eve/swtpm-seed-test$
shah@shah:~/shah-dev/eve/swtpm-seed-test$
shah@shah:~/shah-dev/eve/swtpm-seed-test$
shah@shah:~/shah-dev/eve/swtpm-seed-test$ ./test-swtpm-seed.sh
shah@shah:~/shah-dev/eve/swtpm-seed-test$ ls
ek1.pem  ek2.pem  test-swtpm-seed.sh
shah@shah:~/shah-dev/eve/swtpm-seed-test$ diff <(openssl rsa -in ek1.pem -pubin -outform DER -pubout | openssl dgst -sha256) \
     <(openssl rsa -in ek2.pem -pubin -outform DER -pubout | openssl dgst -sha256)
writing RSA key
writing RSA key
1c1
< SHA2-256(stdin)= 90d39a7ff6f1b377898d1dc7867d63e4f6c09c24a39b8eabc0d61a3cbc4edfab
---
> SHA2-256(stdin)= 01d4226284fd74c939ebae367dea6d51c22e3f496337ead3b8b52c83a017262f
shah@shah:~/shah-dev/eve/swtpm-seed-test$

@eriknordmark eriknordmark left a comment

I added three questions as comments, and it seems like commitlint wants sentences which start with a capital letter.

Running the tests

@shjala shjala force-pushed the swtpm_in_vm branch 2 times, most recently from b13bc8e to 7839ea7 Compare September 3, 2024 08:16
Just a clean up and renaming old vtpm binary to its new name ptpm.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Move the patch file to a more appropriate location.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Create necessary directories in /persist and /run
for swtpm and vtpm to store their data and set the
correct permissions.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Remove unnecessary access to /config and add access
to /persist in the vtpm container. /persist is used
to store the TPM state.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Add apparmor profiles for swtpm and ptpm and update
the existing profiles for vtpm to reflect the changes.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Move old vtpm docs to PTPM.md and update VTPM.md
with new content reflecting the new vtpm service.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
…e VMs

This commit brings SWTPM into EVE in order to emulate a virtual TPM in
the VMs, making the TPM accessible in VMs like a normal TPM device under /dev/tpm*.

SWTPM saves and loads the TPM state on/from the disk, so at the next VM
boot all the TPM keys, TPM NVRAM data, etc are present. In addition SWTPM
is configured to encrypt the TPM state files using a 256-bit AES key. We
use the same key as vault key, this key stored in TPM and access to it is
authorized using a PCR policy, this makes it easier to manage since if
we use a separate key for SWTPM states, there is one more key to
backup/restore from the controller.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Restrict vtpm access to needed directories only, it
doesn't need access to all of the /persist and /run,
just /persist/swtpm and /run/swtpm.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Just in case the swtpm takes a bit longer to start up,
due to the system being busy or TPM being slow to
release the encryption key, increase the timeout
to 10 seconds.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
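
The start-up timeout described in the last commit message amounts to polling for the swtpm socket with an upper bound. A minimal sketch of that idea follows; the helper name wait_for_path and the example socket path are hypothetical, and the actual implementation in the PR may be structured differently.

```shell
#!/bin/sh
# Sketch only: wait for a path (e.g. a swtpm socket) to appear, polling once
# per second, and give up after a timeout.

# wait_for_path PATH TIMEOUT_SECONDS
# Returns 0 as soon as PATH exists, non-zero if it is still missing after
# roughly TIMEOUT_SECONDS seconds.
wait_for_path() {
    path=$1
    remaining=$2
    while [ "$remaining" -gt 0 ]; do
        [ -e "$path" ] && return 0
        sleep 1
        remaining=$((remaining - 1))
    done
    # Final check after the last sleep.
    [ -e "$path" ]
}

# Example with the 10-second budget (the socket path is made up):
#   wait_for_path /run/swtpm/vm01.ctrl.sock 10 || echo "swtpm did not start" >&2
```

A stricter variant could test with -S instead of -e so that only a socket, not a stale regular file, satisfies the wait.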
@eriknordmark eriknordmark merged commit 0c5c458 into lf-edge:master Sep 3, 2024
25 of 26 checks passed
Labels
enhancement New feature or request
6 participants