Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: extract core resource types #30

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
79 changes: 79 additions & 0 deletions 030-extract-core-resources/proposal.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
# Summary

Concourse currently ships with the following "core" resource types:

* [`bosh-io-release`](https://github.com/concourse/bosh-io-release-resource)
* [`bosh-io-stemcell`](https://github.com/concourse/bosh-io-stemcell-resource)
* [`cf`](https://github.com/concourse/cf-resource)
* [`docker-image`](https://github.com/concourse/docker-image-resource)
* [`git`](https://github.com/concourse/git-resource)
* [`github-release`](https://github.com/concourse/github-release-resource)
* [`hg`](https://github.com/concourse/hg-resource)
* [`mock`](https://github.com/concourse/mock-resource)
* [`pool`](https://github.com/concourse/pool-resource)
* [`registry-image`](https://github.com/concourse/registry-image-resource)
* [`s3`](https://github.com/concourse/s3-resource)
* [`semver`](https://github.com/concourse/semver-resource)
* [`time`](https://github.com/concourse/time-resource)
* [`tracker`](https://github.com/concourse/tracker-resource)

This proposal is to remove them all from the `.tgz` distribution, leaving only the `registry-image` resource, as it will be necessary for fetching other resource types.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the registry-image-resource built-in might still lead to versioning issues, as different ATCs could have different versions of the registry-image-resource.

Instead of bundling in the registry-image-resource, we could have built-in image fetcher functionality in Concourse that is only used to fetch the core resource types, i.e. the ones without a parent type that are defined at startup in the resource-types.yml.

These images could be fetched when the relevant core resource type is being used. Currently, if we run a check for a resource, we import the image of its base resource type into Baggageclaim and use that to create a check container. Instead of importing it, we could fetch the image using the built-in image fetcher and then treat it as a regular volume from that point on.

We would know that a given resource type is a core resource type because

  1. it would not have a parent type
  2. it would not be associated with a pipeline.

Any custom resource type or normal Concourse resource would still be fetched using the registry-image-resource once we have it, which means that regular pipeline flow would not change.

TBD: Do we want to surface the core resource type image fetching logs somehow? Currently we do not surface logs for checks.

Thoughts?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered offline environments? We are using Concourse in 3 air-gapped environments on AWS, which heavily rely on S3 since they can't reach the public docker registry. The change proposed here would require us to maintain a docker registry.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvshahid do you use any resource types other than the core resource types? Wouldn't you therefore need a docker registry anyway to use any non-core resource types?

Additionally - if you can use S3 in your environment, can you use ECR (AWS's managed Docker registry)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the S3 resource. To be more specific, this is a US Federal user deploying to a separate partition of AWS. I am not sure if ECR is available in that region/partition.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvshahid For what it's worth, the AWS Region Table does claim that ECR is available in US GovCloud (both West and East).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if services are provider, we've have users that are not permitted to use these services. For instance, a user cannot host docker images because it requires security scanning and can slow their development of pipelines. The scanning of multiple pieces of software can take longer than one large one.

Copy link
Member Author

@vito vito Jun 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could make it easy to install a resource type archive into a local directory for the worker to load up, which is exactly what's happening today under the hood - it's just automated on start, and not documented for public use. Would that be sufficient for these use cases? Operators would just need to grab the archives (e.g. from our GitHub releases) and install them to their cluster.

Something like this:

$ mkdir /opt/concourse/resource-types
$ cd /opt/concourse/resource-types
$ wget https://github.com/concourse/git-resource/releases/download/v1.7.2/git-resource-1.7.2-ubuntu.tgz
$ concourse-worker --resource-types /opt/concourse/resource-types

With proper tooling this should be pretty easy to automate for a worker pool.

Copy link

@jtarchie jtarchie Jun 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that would work for us/them.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We relied on this technique in the past to add custom resource types to BOSH deployed Concourse. This method broke when Concourse moved the resource types from /var/vcap/packages/foo-resource-type to /var/vcap/packages/concourse/resource-types/foo. The new directory structure makes it weird to add new custom types since we will have to modify the Concourse package on disk.

I am mentioning this to make sure the extension points proposed above don't suffer from similar drawbacks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvshahid Thanks, will keep that in mind! I think if we can get the containerd track of work wrapped up before this, we might be able to make it as simple as placing OCI image tarballs in a directory. 🤞


# Motivation

1. Packaging them all up with Concourse inflates the download size, especially as we standardize on `ubuntu` as the base image instead of `alpine`.

1. Having commonly-used resource types bundled into the core set makes them feel "special/blessed" and may result in a delayed introduction to/understanding of `resource_types:` for new users.

1. A few of these resource types are awfully specific and not likely to be used by the majority of Concourse users:

* `bosh-io-release`
* `bosh-io-stemcell`
* `cf`
* `tracker`

# Proposal

Removing them is the easy part. The harder part is doing this without requiring everyone to update their pipelines.

We could do this by having an operator-specified config of resource types to automatically provide to each pipeline:

```yaml
---
resource_types:
- name: git
type: registry-image
source:
repository: concourse/git-resource
tag: 1
- name: time
type: registry-image
source:
repository: concourse/time-resource
tag: 1
```

This would be passed to `concourse web` on startup like so:

```sh
concourse web ... --core-resource-types resource-types.yml
```

This would have the effect of being merged into each pipeline's `resource_types:` (with the pipeline taking precedence).

Operators should be careful to provide a sane `tag:` value so that newly shipped versions do not break all pipelines. At the very least a major version should be specified. This is up to the operator's discretion, however, and will not be enforced.

# Open Questions

* This will result in a lot more polling against the Docker registry. Should we have a longer default check interval for resource types?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even disable checking for new version? To keep the exact behaviour as now, where included resource types are only updated when updating Concourse. Otherwise this introduces a kind of silent updates that can't be blocked

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these core resources would be pinned to specific image digest (ie the sha256:… value) then there would be no need to check registry again, right?

But looks like current registry-image resource doesn't support digest-based pinning yet.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently resources can use version when specifying an exact version of resource to use. I could see this being added to resource types.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of right now, we do not check base resource types. We could retain this behaviour to avoid the extra checks.

Currently, we only check resource types which are defined as part of a pipeline. There is no such thing as a resource type that does not exist in a pipeline. However if we modeled these core resource types as regular resource types, no checks would occur because they are not defined in a pipeline.

When fetching these resources, we could just fetch latest by default or the version specified in the yaml as @jtarchie suggested. If you want a new version of the base resource type, you have to change the resource-types.yml and restart the ATC.

Copy link

@deniseyu deniseyu Dec 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to think about:

Resource type versions are taken into account when a resource's version history is composed, to be shown in UI. Currently (at least as of the v5.x line - sorry I'm not exactly sure when this change was introduced) when custom resource type's image is updated, versions fetched with an older version of the resource type aren't shown by default in the resource UI. (You can force them to come back by running fly check-resource.) If the fetching mechanism is changed across the board for all resource types, then users are going to start seeing this edge case a lot more frequently. Here's more context on the issue that I saw: concourse/concourse#4893

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd definitely vote for erring on the side of a longer check interval. Like even once a day. But maybe this can be configurable in the resource-types.yml that operators are feeding in for pipeline-wide config.

Probably quite a few Concourse users are routing outgoing traffic through load balancers or NAT boxes or stuff that will make it look like all traffic is coming from the same IP, and we're starting to see rate limiting and throttling from Docker Hub. This is not scientific evidence but in this issue thread, it looks like lots of users of other CI/CD tools have also experienced similar: docker/hub-feedback#1907

* Should Concourse come with a default set of operator-provided resource types?
* If so, which ones?
* Are we ready to rely on the `registry-image` resource? See [concourse/concourse#3092](https://github.com/concourse/concourse/issues/3092).

# Answered Questions

* I don't think we should add team-level `resource_types:` configuration. That's too many layers to check for a config.

# New Implications

* We'll have to be careful not to break the latest version of each resource type, as all Concourse clusters will quickly pick up the new version (whether from `tag: latest` or `tag: 1`).