Discussion:
Building cloud images using Debian infrastructure
Bastian Blank
2018-03-18 14:49:48 UTC
Permalink
Hi

I did some more work on building the cloud images: how to schedule
builds, how to perform them and how to get the data where we need it.
The whole thing uses infrastructure Debian provides; the only exception
is the image release step.

The components are:
- salsa.debian.org with GitLab, responsible for storing code, scheduling
builds, storing logs and integrity details for eternity and storing
image data for a short amount of time.
- casulana.debian.org with kvm, responsible for providing the
environment to perform builds and tests.
- Somewhere for storing data with user access.
- Some environments may need additional systems (EC2 can only use
snapshots of existing disks, so the image release step must run
there).

All builds are orchestrated by GitLab CI, running on salsa.debian.org.
The whole build definition resides in the same Git repo as the FAI config
and scripts, so as much information as possible is kept in one place.
Builds are split up into jobs, each running independently and possibly
depending on another. For now I'd like to define the following jobs:

- Build images for all supported environments and dists.

On each push, regardless of the branch, a subset of these builds is
performed. All the other builds are only performed on the scheduled
runs.

On casulana we should be able to do a full set of builds in about 10
minutes. Right now the compression of the build result dominates the
required CPU time, as it uses xz -9 to get the output down to a usable
size for storage.

Each build runs in a scratch Docker environment via a specially
configured GitLab Runner. The builds need access to loop devices,
which is not allowed by default. (Yes, I'm aware that neither Docker
nor GitLab Runner has a suitable version in Debian Stretch.)

Some of these builds will currently fail (Azure on Buster, GCE on
Buster and Sid), so their status will be ignored.

I have a test repo that performs this operation already:
https://salsa.debian.org/waldi/fai-cloud-images/

Full build:
https://salsa.debian.org/waldi/fai-cloud-images/pipelines/3155

- Tests for images.

I'm not sure if scheduled builds should perform detailed tests on all
platforms, or if this should be restricted to releases and explicit
triggers.

At a minimum we should do a smoke test and check whether the system
boots to userspace (call qemu and read the serial output until systemd
shows itself; this weeds out a broken bootloader, kernel or filesystem).
Even without kvm this takes less than 30 seconds, so it is an easy test
to perform; a rough sketch follows below.

- For all scheduled runs, upload images and metadata to user-accessible
storage.

- A manual job to release the images. This triggers a pipeline from a
different project. This new pipeline contains the following jobs:

- Upload image to platforms.
- Test new instance using the images.
- Publish images.
- Notify debian-***@.

While this proposal introduces some complexity, it uses common
components and handles the special parts itself. It allows a user to
set up their own environment with only a small amount of effort:

- Set up GitLab or use an existing instance.
- Set up a GitLab Runner with Docker to run the builds, using some
documented config options.
- Configure some build options.

Regards,
Bastian
--
Our missions are peaceful -- not for conquest. When we do battle, it
is only because we have no choice.
-- Kirk, "The Squire of Gothos", stardate 2124.5
Bastian Blank
2018-03-24 14:51:58 UTC
Permalink
Post by Bastian Blank
- Tests for images.
I'm not sure if scheduled builds should perform detail tests on all
platforms, or if this should be restricted to releases and explicit
triggers.
I thought about tests again. I think we should do a test with a mocked
metadata service for each build, so we know that the image is able to
provision. This should take no more than two minutes.

The EC2 metadata protocol is easy and can be served from a static
server. GCE seems to be static as well. We just somehow need
169.254.169.254 to be accessible, but qemu's network guest forwards
should take care of that.
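
To illustrate, a minimal sketch of such a static mock for an EC2-style
metadata service; the paths and values served here are examples only, and
wiring it up so the guest reaches it as 169.254.169.254 (for instance via
qemu's guest forwards) is left to the test harness:

#!/usr/bin/env python3
# Tiny static mock of an EC2-style metadata service for provisioning tests.
# The paths and values are placeholders; a real test would serve whatever
# the image's provisioning tooling actually asks for.
from http.server import BaseHTTPRequestHandler, HTTPServer

METADATA = {
    "/latest/meta-data/instance-id": "i-00000000000000000",
    "/latest/meta-data/local-hostname": "test-image",
    "/latest/meta-data/public-keys/0/openssh-key": "ssh-rsa AAAA... test@example",
    "/latest/user-data": "#cloud-config\nhostname: test-image\n",
}

class MetadataHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = METADATA.get(self.path)
        if body is None:
            self.send_error(404)
            return
        data = body.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    # Listen locally; the guest reaches it as 169.254.169.254 through the
    # hypervisor's network forwarding.
    HTTPServer(("127.0.0.1", 8169), MetadataHandler).serve_forever()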

For Azure I'm not sure, as the agent generates an RSA key and somehow
uses it to communicate with the old metadata service; I have not yet
worked with the new service used on Azure Stack.

Any thoughts?

Are there more pictures of the stuff you wrote on the board at the last
cloud sprint? I only know about
https://photos.app.goo.gl/5066UjdxGNiH6Qur2

Regards,
Bastian
--
Where there's no emotion, there's no motive for violence.
-- Spock, "Dagger of the Mind", stardate 2715.1
Thomas Lange
2018-03-24 18:46:44 UTC
Permalink
I've uploaded two pictures of the whiteboard to this URL
https://informatik.uni-koeln.de/public/lange/debian-cloud-2017/
--
regards Thomas
Tomasz Rybak
2018-03-25 20:15:49 UTC
Permalink
Post by Bastian Blank
Post by Bastian Blank
- Tests for images.
I'm not sure if scheduled builds should perform detail tests on all
platforms, or if this should be restricted to releases and
explicit
triggers.
I thought about tests again. I think we should do a test with a mocked
metadata service for each build, so we know that the image is able to
provision. This should take no more than two minutes.
What do you mean by that?
I've recently been thinking about how to test our images,
and how to integrate the tests we've been working on during
the sprint; I haven't come to any solutions yet, though.

But it would be a good place to start.
Should I create new thread, so we can discuss it?

Best regards.
--
Tomasz Rybak, Debian Developer
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C
Bastian Blank
2018-03-26 19:12:48 UTC
Permalink
Post by Tomasz Rybak
What do you mean by that?
I've recently been thinking about how to test our images,
and how to integrate test we've been working on during
the spring; haven't come to any solutions yet, though.
Okay, let's outline my solution, which I haven't tested yet. For the
official builds all this stuff runs in kvm instances with nested-vmx on
casulana.

- Each test runs in a disposable Docker environment.

This gives us a completely scripted test environment, which means less
chance that documentation and implementation get out of sync, and it
guards against someone changing things by hand.

- It sets up a complete network:
  - a bridge,
  - some NAT for fairly controlled access to the outside,
  - DNS, DHCPv4 and DHCPv6 servers (let's see if dnsmasq works properly),
  - a mock metadata server for the tested platform (see the sketch after
    this list).

- It starts the image with kvm.

- It can then run the rest of the tests.
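
A rough sketch of how such a disposable environment could be wired up from
a script; the bridge name, addresses and tool invocations are assumptions
for illustration, they need root, and in the official setup they would run
inside the scratch environment rather than on the host:

#!/usr/bin/env python3
# Sketch of the disposable test network: bridge + NAT + dnsmasq, then boot
# the image under kvm. Names, addresses and options are examples only.
import subprocess

BRIDGE = "cloudtest0"
SUBNET = "192.0.2"

def run(*cmd):
    subprocess.run(cmd, check=True)

# bridge with an address, brought up
run("ip", "link", "add", "name", BRIDGE, "type", "bridge")
run("ip", "addr", "add", f"{SUBNET}.1/24", "dev", BRIDGE)
run("ip", "link", "set", BRIDGE, "up")

# NAT for controlled access to the outside
run("iptables", "-t", "nat", "-A", "POSTROUTING",
    "-s", f"{SUBNET}.0/24", "-j", "MASQUERADE")

# DNS and DHCP on the bridge
run("dnsmasq", f"--interface={BRIDGE}", "--bind-interfaces",
    f"--dhcp-range={SUBNET}.10,{SUBNET}.100,12h")

# the mock metadata server (see the earlier sketch) would be started here
# and 169.254.169.254 routed or DNAT'ed to it

# finally boot the image; attaching the guest NIC to the bridge (tap device
# or qemu bridge helper) is elided for brevity
run("kvm", "-m", "1024", "-nographic",
    "-drive", "file=disk.qcow2,format=qcow2,if=virtio")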

Bastian
--
Ahead warp factor one, Mr. Sulu.
Tomasz Rybak
2018-03-27 22:09:40 UTC
Permalink
Post by Bastian Blank
Post by Tomasz Rybak
What do you mean by that?
I've recently been thinking about how to test our images,
and how to integrate test we've been working on during
the spring; haven't come to any solutions yet, though.
Okay, let's outline my solution, which I didn't test yet. For the
official builds all this stuff runs in kvm instances with nested-vmx on
casulana.
- Each test runs in a disposable Docker environment.
This means we have a completely scripted test environment. This means
less chance that documentation and implementation gets out of sync.
Or just work against someone changing stuff by hand.
- a bridge,
- some NAT for pretty controlled access to the outside,
- dns, dhcpv4 and dhcpv6 server (lets see if dnsmasq works
properly),
- a mock metadata server for the tested plattform.
- It starts the image with kvm.
- It can then run the rest of the tests.
Seems reasonable.


OK, it's time to start thinking about integrating
our tests. It looks like we now have 3 testing solutions:
1. yours
2. one by Thomas (zigo)
3. one that we were working on during the last sprint:
https://salsa.debian.org/cloud-team/debian-cloud-tests

Have you written some code already? If so, where is it?

I'll try to gather all the information about our existing
testing solutions and send some summary to the list.

Best regards.
--
Tomasz Rybak, Debian Developer
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C
Bastian Blank
2018-03-28 09:39:48 UTC
Permalink
Post by Tomasz Rybak
OK, it's time to start thinking about integrating
1. yours
Mine is not so much a test solution as a test environment where you can
run a VM with an arbitrary image.
Post by Tomasz Rybak
Have you written some code already? If so, where is it?
I haven't written any code for the test setup yet.

Bastian
--
Intuition, however illogical, is recognized as a command prerogative.
-- Kirk, "Obsession", stardate 3620.7
Tomasz Rybak
2018-04-02 20:16:37 UTC
Permalink
Post by Bastian Blank
Post by Tomasz Rybak
OK, it's time to start thinking about integrating
1. yours
Mine is not so much a test solution but a test environment where you can
run a vm with an arbitrary image.
Our test harness (on which we worked in Seattle) starts
a VM in the cloud, checks it from the outside (open ports, etc.),
opens an SSH connection, copies code to the VM, and runs it (and thus
additional tests).

From your description it looks like we have some common areas
there.
Post by Bastian Blank
Post by Tomasz Rybak
Have you written some code already? If so, where is it?
I haven't written any code for the test setup yet.
Especially because of this - do you want to create your own solution,
or do you want to try to use and maybe extend our existing code?
Our tests (and management code) are written in Python.

Best regards.
--
Tomasz Rybak, Debian Developer
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C
Bastian Blank
2018-08-02 14:25:52 UTC
Permalink
Moin

Sorry, but it took a bit longer than I anticipated for an update.

I finally set up a VM on casulana, and it builds the stuff in our
repository:

https://salsa.debian.org/cloud-team/fai-cloud-images/pipelines/15397
Post by Bastian Blank
Each build runs in a scratch Docker environment via a special
configured GitLab Runner. The builds need access to loop devices,
which is by default not allowed. (Yes, I'm aware that neither Docker,
nor GitLab Runner, have suitable versions in Debian Stretch.)
Right now this VM runs, but it was set up by hand. This obviously needs
to change. However, during testing I found some problems with it:

The main problem is that all the builds need to run with pretty high
privileges and access shared resources (mostly the loop devices used to
mount filesystems). In case of errors, or if someone just cancels the
builds, these shared resources are not cleaned up and will break the
builder sooner or later.

Also, this VM needs at least 32 GB of RAM to run a sufficient number
of parallel builds (casulana has 88 CPU cores, and we should use them),
and it can't shrink between builds.

So I intend to change that setup a bit. It will look more like the
general-purpose CI runner for privileged builds I set up for Salsa over
the last few weeks. It will use docker-machine to set up and operate
scratch VMs. There is an experimental QEMU driver, which will need some
changes to use CoreOS for the VMs, but it uses qemu's user networking to
communicate with the VM, so it does not need any privileges.

Regards,
Bastian
--
Time is fluid ... like a river with currents, eddies, backwash.
-- Spock, "The City on the Edge of Forever", stardate 3134.0
Thomas Lange
2018-08-02 15:11:30 UTC
Permalink
Hi Bastian,

I'm not sure if you know that we have this new repository:
https://salsa.debian.org/cloud-team/qemu-vm

This is what Luca set up for OpenStack to create new, clean VMs. We may
want to use this for calling fai-diskimage inside the VMs when
building the images.

I want to avoid multiple people working on the same thing without
knowing what the others are doing. Or am I misinterpreting this and
yours is needed for something else? I'm confused.
--
regards Thomas
Bastian Blank
2018-08-02 16:09:08 UTC
Permalink
Post by Thomas Lange
This is what luca set up for openstack to create new clean VMs. We may
want to use this for calling fai-diskimage inside the VMs when
building the images.
Well, I don't want another single-use solution. How will you hook this
into GitLab CI?

Bastian
--
Killing is stupid; useless!
-- McCoy, "A Private Little War", stardate 4211.8
Bastian Blank
2018-08-02 16:33:26 UTC
Permalink
Post by Bastian Blank
Post by Thomas Lange
This is what luca set up for openstack to create new clean VMs. We may
want to use this for calling fai-diskimage inside the VMs when
building the images.
Well, I don't want another single use solution. How will you hook this
into the Gitlab CI?
Sorry, this sounded a bit harsh.

My project tries to use a lot of existing services, and it works pretty
well.

Of course, if you want to build the VM management by hand, you can do it.
But will it integrate, or do you need to replace the whole rest of the
stack as well?

Bastian
--
Another Armenia, Belgium ... the weak innocents who always seem to be
located on a natural invasion route.
-- Kirk, "Errand of Mercy", stardate 3198.4
Thomas Lange
2018-08-08 09:47:48 UTC
Permalink
Post by Bastian Blank
Of cause, if you want to build the VM management by hand, you can do it.
But will it integrate or do you need to replace the whole rest of the
stack as well?
I do not know how we want to build this whole CI stuff. I'm only the
FAI expert. But last year we worked on this build process a little bit
(there was this big picture on the whiteboard), and now something
different has shown up. That is the part that confuses me. I guess this
will be a big topic during the next cloud sprint.

I wonder why you worry about 32 GB of RAM. Casulana has a lot of RAM,
and 32 GB should not cause it any trouble. Was this the main reason to
build something using Docker and CoreOS?
--
regards Thomas
Bastian Blank
2018-08-08 21:14:38 UTC
Permalink
Post by Thomas Lange
I do not know how we want to build this whole CI stuff. I'm only the
FAI expert. But last year we worked on this build process a little bit
(there was this big picture on the whiteboard) and now something
different showed up.
I saw that picture, but it contained no implementation-specific
information. Also, Salsa did not exist last year, but now it does.
Post by Thomas Lange
I wonder why you worry about 32GB RAM? Casulana has a lot of RAM and
32GB should not cause any trouble to it.
It's just not nice to do it this way. And it limits scalability.
Post by Thomas Lange
Was this the main reason to
build something using docker and CoreOS?
No, the main reason is isolation. The builds take some global
resources (loop devices) and may not return them in case of errors.

Bastian
--
Virtue is a relative term.
-- Spock, "Friday's Child", stardate 3499.1
Jimmy Kaplowitz
2018-08-08 21:47:06 UTC
Permalink
Post by Bastian Blank
Post by Thomas Lange
Was this the main reason to
build something using docker and CoreOS?
No, the main reason it isolation. The builds take some global
resources, loop devices, and may not return them in case of some errors.
Google builds their official GCE Debian images inside transient GCE
instances, solely for isolation purposes (they use the Debian cloud team
build tools, probably still bootstrap-vz until we get FAI sufficiently
working). To be clear, nothing about that needs to be in GCE, except for
a few implementation details of their particular build harness. Regular
VMs work fine.

I support the goal of isolation, but transient VMs can serve the same
purpose in a workflow that's more easily portable between casulana,
GitLab CI (I presume?), a personal dev laptop, and anywhere else one
might want to reproduce the flow. That seems like a win for maximizing
how easy it is for people to hack on this, and also for companies like
Google to converge further with us on tooling.

- Jimmy Kaplowitz
***@debian.org
Bastian Blank
2018-08-09 14:11:19 UTC
Permalink
Post by Jimmy Kaplowitz
I support the goal of isolation, but transient VMs can serve the same
purpose
This setup uses transient VMs for isolation. Isolation is the goal;
transient VMs are the way to achieve it.

On casulana it can only run qemu directly. On GCE it would just start a
VM on the platform.
Post by Jimmy Kaplowitz
purpose in a workflow that's more easily portable between casulana,
GitLab CI (I presume?), a personal dev laptop, and anywhere else one
might want to reproduce the flow.
A user will have the following ways to build it:
- Push into the cloud-team repo and the builder on casulana will pick it
up.
- Push into a private repo and the shared builder will pick it up. This
does not yet work due to a missing config option and tags on the
builders.
- Use "gitlab-runner exec docker --docker-privileged $job" to run it
from the checkout on her own Docker instance.
- Use "gitlab-runner exec shell $job" or "gitlab-runner exec ssh $job"
to do the same either on the local machine or another one.
- Use "make $job" to run it by hand from the working copy. We need to
rename stuff a bit for that.

All of that needs documentation, including information on how to set up
a capable runner. I'm currently trying to convince gitlab.com to change
their config a bit so the build works without changes.

Regards
Bastian
--
The sooner our happiness together begins, the longer it will last.
-- Miramanee, "The Paradise Syndrome", stardate 4842.6
Jimmy Kaplowitz
2018-08-10 22:44:50 UTC
Permalink
Post by Bastian Blank
On casulana it only can run qemu directly. On GCE it would just start a
VM on the platform.
Ideally the workflow would work in any VM host, whether that's qemu,
GCE, GitLab CI, or AWS. There may be some platform-specific details in the
build harness, for example to provide relevant bits of configuration, but
those should just set up and invoke the same build flow regardless of
platform.

I notice that a lot of your instructions refer to Docker, though. Are
you talking about running Docker inside transient VMs or using it
instead of transient VMs?
Post by Bastian Blank
- Push into the cloud-team repo and the builder on casulana will pick it
up.
- Push into a private repo and the shared builder will pick it up. This
does not yet work due to a missing config option and tags on the
builders.
- Use "gitlab-runner exec docker --docker-privileged $job" to run it
from the checkout on her own Docker instance.
- Use "gitlab-runner exec shell $job" or "gitlab-runner exec ssh $job"
to do the same either on the local machine or another one.
- Use "make $job" to run it by hand from the working copy. We need to
rename stuff a bit for that.
All of that need documentation, including information how to setup a
capable runner. I'm currently trying to convince gitlab.com to change
their config a bit to make the build working without changes.
My hope is that nobody would need to know or use GitLab (except that
they would git clone our code from Salsa), nor have any write access to
Debian infrastructure whatsoever, in order to reproduce our builds.
Independent verifiability is good, and requiring installation of gitlab
or knowledge of gitlab-runner seems like unnecessary complexity.

I have no problem if we choose to use Salsa and/or casulana for our
builds. For builds which are run via Salsa, I imagine gitlab-runner
would be in the mix as a way to launch the build process. That's fine.

But any Debian user with a laptop and a way to run VMs should be able to
reproduce our builds without installing GitLab or seeking an account
from anyone. (The only exception would be if a provider-specific image
has a reason to require building on their platform, but then that would
still not require a Debian-controlled account.)

- Jimmy Kaplowitz
***@debian.org
Bastian Blank
2018-08-11 13:06:53 UTC
Permalink
Post by Jimmy Kaplowitz
Post by Bastian Blank
On casulana it only can run qemu directly. On GCE it would just start a
VM on the platform.
I notice that a lot of your instructions refer to Docker, though. Are
you talking about running Docker inside transient VMs or using it
instead of transient VMs?
I would not say it uses transient VMs if it didn't. So it runs Docker
as the runtime environment within transient VMs.
Post by Jimmy Kaplowitz
My hope is that nobody would need to know or use GitLab (except that
they would git clone our code from Salsa), nor have any write access to
Debian infrastructure whatsoever, in order to reproduce our builds.
I don't know where this idea comes from. While this stack uses Gitlab
services and runs on salsa.d.o, you can always use the inner layers
without it.
Post by Jimmy Kaplowitz
Independent verifiability is good, and requiring installation of gitlab
or knowledge of gitlab-runner seems like unnecessary complexity.
But any Debian user with a laptop and a way to run VMs should be able to
reproduce our builds without installing GitLab or seeking an account
from anyone.
Post by Bastian Blank
- Use "make $job" to run it by hand from the working copy. We need to
rename stuff a bit for that.
Where does it say Gitlab in this? It is just "make".

The whole thing is set up in layers. You can remove several until you
are at the basic build stuff. You don't need to use all of them; it's
just convenient.

Right now the layers are:
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai

I hope that clears up some assumptions.

Bastian
--
If some day we are defeated, well, war has its fortunes, good and bad.
-- Commander Kor, "Errand of Mercy", stardate 3201.7
Luca Filipozzi
2018-08-12 00:51:28 UTC
Permalink
Post by Bastian Blank
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopolizing the machine. One of the things we could/should
do is turn those 'build jobs' into things that can be executed by a
scheduler such as SGE or gitlab-runner. Can we inject
non-gitlab-originating jobs into gitlab's scheduler?

The other thing we could do is set up a system-wide semaphore that both
cd-build and gitlab-runner use, but I prefer the idea of enqueuing jobs
into SGE (or equivalent) and using appropriate parameters to identify
how much CPU/RAM/etc. is available on the runner vs. what a job needs.
That way, we make the best use of casulana.

Thoughts?
--
Luca Filipozzi
Bastian Blank
2018-08-12 07:56:18 UTC
Permalink
Hi Luca
Post by Luca Filipozzi
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopoloizing the machine.
Sadly the graphs created by munin are pretty useless for finding the real
short-term usage. For the CD-building peak (the only large peak on the
weekly graph) they show a disk usage of 2k IOPS and 22 CPUs.[1] That is
very far from full load for this machine.
Post by Luca Filipozzi
One of the things we could/should
do is turn those 'build jobs' things that can be executed by a scheduler
such as sge or gitlab-runner. Can we inject non-gitlab-originating jobs
into gitlab's scheduler?
You create a project and create jobs on it either by using schedules[2]
or with triggers[3].
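
For illustration, a minimal sketch of injecting a job this way through the
pipeline trigger API from [3]; the project ID, trigger token and variable
are placeholders that would have to be set up on the project first:

#!/usr/bin/env python3
# Sketch: start a pipeline on an existing project via the trigger API ([3]).
# Project ID, token and variables below are placeholders.
import urllib.parse
import urllib.request

GITLAB = "https://salsa.debian.org/api/v4"
PROJECT_ID = "12345"              # placeholder numeric project ID
TRIGGER_TOKEN = "secret-token"    # created in the project's CI settings

data = urllib.parse.urlencode({
    "token": TRIGGER_TOKEN,
    "ref": "master",
    "variables[BUILD_KIND]": "cd-image",   # example variable for the job
}).encode()

req = urllib.request.Request(
    f"{GITLAB}/projects/{PROJECT_ID}/trigger/pipeline", data=data)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())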
Post by Luca Filipozzi
The other thing we could do is set up system-wide semaphor that both
cd-build and gitlab-runner use, but i prefer the idea of enqueuing jobs
into SGE (or equivalent) and using appropriate parameters to identify
number of CPU/RAM/etc is available on the runner vs what a job needs.
gitlab-runner does not have any resource controls, apart from the number
of parallel jobs. In all the shared installations, resources are
controlled by hard limits per job in the form of a VM.

RAM usage looks pretty negligible; 90% is used by cache. For CPU
and IO usage there are no usable estimates due to missing data.
Post by Luca Filipozzi
That way, we make best use of casulana.
Thoughts?
I'm not convinced that we need such strong separation, but I've never
looked at the real usage.

Regards,
Bastian

[1]: https://munin.debian.org/debian.org/casulana.debian.org/index.html
[2]: https://docs.gitlab.com/ce/api/pipeline_schedules.html
[3]: https://docs.gitlab.com/ce/api/pipeline_triggers.html
--
We Klingons believe as you do -- the sick should die. Only the strong
should live.
-- Kras, "Friday's Child", stardate 3497.2
Thomas Goirand
2018-08-29 15:07:55 UTC
Permalink
Post by Luca Filipozzi
Post by Bastian Blank
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopoloizing the machine. One of the things we could/should
do is turn those 'build jobs' things that can be executed by a scheduler
such as sge or gitlab-runner. Can we inject non-gitlab-originating jobs
into gitlab's scheduler?
The other thing we could do is set up system-wide semaphor that both
cd-build and gitlab-runner use, but i prefer the idea of enqueuing jobs
into SGE (or equivalent)
What is SGE? A google non-free version of beanstalk?

Cheers,

Thomas Goirand (zigo)
Luca Filipozzi
2018-08-29 15:24:48 UTC
Permalink
Post by Thomas Goirand
Post by Luca Filipozzi
Post by Bastian Blank
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopoloizing the machine. One of the things we could/should
do is turn those 'build jobs' things that can be executed by a scheduler
such as sge or gitlab-runner. Can we inject non-gitlab-originating jobs
into gitlab's scheduler?
The other thing we could do is set up system-wide semaphor that both
cd-build and gitlab-runner use, but i prefer the idea of enqueuing jobs
into SGE (or equivalent)
What is SGE? A google non-free version of beanstalk?
Son of Grid Engine, the open-source fork? of Sun Grid Engine.

Packaged in Debian as 'gridengine-*'

A job scheduler / resource manager. More suited to clusters, but still
useful for a large machine like casulana.
--
Luca Filipozzi
Paul Dejean
2018-08-29 15:28:27 UTC
Permalink
I honestly don't get it. Why is casulana so necessary for building these
images going forward? What kicked off this thread was me demonstrating that
machine images could be built in GitLab on Google Cloud runners that have
nested virt support.
Post by Luca Filipozzi
Post by Thomas Goirand
Post by Luca Filipozzi
Post by Bastian Blank
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopoloizing the machine. One of the things we
could/should
Post by Thomas Goirand
Post by Luca Filipozzi
do is turn those 'build jobs' things that can be executed by a
scheduler
Post by Thomas Goirand
Post by Luca Filipozzi
such as sge or gitlab-runner. Can we inject non-gitlab-originating jobs
into gitlab's scheduler?
The other thing we could do is set up system-wide semaphor that both
cd-build and gitlab-runner use, but i prefer the idea of enqueuing jobs
into SGE (or equivalent)
What is SGE? A google non-free version of beanstalk?
Son of Grid Engine, the open-source fork? of Sun Grid Engine.
Packaged in Debian as 'gridengine-*'
A job scheduler / resource manager. More suited for cluster but still
useful for a large machine like casulana.
--
Luca Filipozzi
Paul Dejean
2018-08-29 15:34:51 UTC
Permalink
I would also like to add that by using cloud runners we circumvent a lot of
these issues, such as the shared permissions needed, having to set up
runners by hand (assuming we make an infrastructure-as-code repo), and so on.
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating that
machine images could be built in gitlab on google cloud runners that have
nested virt support.
Luca Filipozzi
2018-08-29 15:47:40 UTC
Permalink
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating that
machine images could be built in gitlab on google cloud runners that have
nested virt support.
Primarily, Debian (as a community) has long held the opinion that our
packages, our CD images, and (by extension) our cloud images should be
built on hardware that is owned and operated by Debian. VMs provided by
a third party (AWS, etc.) are only as secure as the third party
(either poor architecture or nefarious intent) or as secure as the
hypervisor (against fourth parties).

This explains why all the build daemons are on Debian-controlled
hardware.

casulana was purchased to address two needs: cd-image and cloud-image
building. The former requires significant resource; the latter not
nearly as much.

Secondarily, as you will have seen by the salsa thread relating to use
of Google storage for git lfs, there are members of the community that
would like to see Debian choose options that (a) make use of open source
software and (b) make us less rather than more reliant on the good will
of entities such as Google and AWS.

Like I said earlier in the thread: the ongoing to-and-fro regarding
using casulana for build and using FAI is not useful at this stage.
Regardless of my personal opinion, I view these as settled discussion
points based on what I saw at the 2017 Cloud Sprint and at the DC18
Cloud BoF.

I'm very appreciative of Bastian's work on getting gitlab build jobs
prepared. gitlab doesn't use gridengine; we may not need to go that far,
but we may wish to introduce some kind of semaphore between gitlab jobs
and cd-image jobs to allow all of casulana to be used by the cd-image
scripts.

Finally, while salsa is using Google storage for git lfs, the ability
for Google to tamper with the objects in git in an undetectable way is
very limited so I'm less concerned about that particular usage of a
third-party resource. I've mentioned that I would love to see several
third-party storage solutions to be employed, ideally in different legal
jurisdictions, for redundancy purposes.

Colleagues, please elaborate if my explanation above is incorrect in any
way.

Cheers,

Luca
--
Luca Filipozzi
Paul Dejean
2018-08-29 16:07:24 UTC
Permalink
OK, that's understandable. Question #1: who pays for this? A datacenter rack
costs money, and whoever owns the data center has physical access. The
actual computer hardware costs money, and not just on a one-time basis either.

Where does "hardware" begin and end? Does debian need to own the rack
rather than renting it? The screws you use to mount the server? The
Ethernet cables?

There's a huge cost to maintaining this too. From my understanding there's
no Mesos cluster set up right now, no Kubernetes, no working OpenStack API.
Creating a private Debian cloud is a lot of work. Not creating a private
Debian cloud and just having a bunch of ad hoc servers is probably even
more work in the long run.

The ideology is admirable, but we need to define clearly what problem we're
trying to solve. Is it avoiding vendor lock-in? If so, there might be ways
to use Google Cloud and avoid vendor lock-in. Is it trying to keep Google
from having access to our private data? If so, a good first step would be
stripping access from any Google employees who might be Debian maintainers
(which would be incredibly silly).

Is it trying to avoid corporate influence? Amazon is already contributing
resources (I think; I might be remembering wrong) and there were plans for
Google to join in soon, as was mentioned in this thread.

I'm not trying to knock ideology; it's what makes Debian not Red Hat. All
I'm saying is that we need to define what exactly the rules and goals are
here, so we know what there is to work with.
Post by Luca Filipozzi
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating
that
Post by Paul Dejean
machine images could be built in gitlab on google cloud runners that have
nested virt support.
Primarily, Debian (as a community) has long-held the opinion that our
packages, our cd images, and (by extension) our cloud images should be
built on hardware that is owned and operated by Debian. VMs provided by
a third party (AWS, etc.) are only as secure as the third party
(either poor architecture or nefarious intent) or as secure as the
hypervisor (against fourth parties).
This explains why all the build daemons are on Debian-controlled
hardware.
casulana was purchased to address two needs: cd-image and cloud-image
building. The former requires significant resource; the latter not
nearly as much.
Secondarily, as you will have seen by the salsa thread relating to use
of Google storage for git lfs, there are members of the community that
would like to see Debian choose options that (a) make use of open source
software and (b) make us less rather than more reliant on the good will
of entities such as Google and AWS.
Like I said earlier in the thread: the ongoing to-and-fro regarding
using casulana for build and using FAI is not useful at this stage.
Regardless of my personal opinion, I view these as settled discussion
points based on what I saw at the 2017 Cloud Sprint and at the DC18
Cloud BoF.
I'm very appreciative of Bastian's work on getting gitlab build jobs
prepared. gitlab doesn't use gridengine; we may not need to go that far,
but we may wish to introduce some kind of semaphor between gitlab jobs
and cd-image jobs to allow all of casulana to be used by the cd-image
scripts.
Finally, while salsa is using Google storage for git lfs, the ability
for Google to tamper with the objects in git in an undetectable way is
very limited so I'm less concerned about that particular usage of a
third-party resource. I've mentioned that I would love to see several
third-party storage solutions to be employed, ideally in different legal
jurisdictions, for redundancy purposes.
Colleagues, please elaborate if my explanation above is incorrect in any
way.
Cheers,
Luca
--
Luca Filipozzi
Luca Filipozzi
2018-08-29 16:22:13 UTC
Permalink
(fixing top-posting)
Post by Paul Dejean
Post by Luca Filipozzi
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating that
machine images could be built in gitlab on google cloud runners that have
nested virt support.
Primarily, Debian (as a community) has long-held the opinion that our
packages, our cd images, and (by extension) our cloud images should be
built on hardware that is owned and operated by Debian. VMs provided by
a third party (AWS, etc.) are only as secure as the third party
(either poor architecture or nefarious intent) or as secure as the
hypervisor (against fourth parties).
This explains why all the build daemons are on Debian-controlled
hardware.
casulana was purchased to address two needs: cd-image and cloud-image
building. The former requires significant resource; the latter not
nearly as much.
Secondarily, as you will have seen by the salsa thread relating to use
of Google storage for git lfs, there are members of the community that
would like to see Debian choose options that (a) make use of open source
software and (b) make us less rather than more reliant on the good will
of entities such as Google and AWS.
Like I said earlier in the thread: the ongoing to-and-fro regarding
using casulana for build and using FAI is not useful at this stage.
Regardless of my personal opinion, I view these as settled discussion
points based on what I saw at the 2017 Cloud Sprint and at the DC18
Cloud BoF.
I'm very appreciative of Bastian's work on getting gitlab build jobs
prepared. gitlab doesn't use gridengine; we may not need to go that far,
but we may wish to introduce some kind of semaphor between gitlab jobs
and cd-image jobs to allow all of casulana to be used by the cd-image
scripts.
Finally, while salsa is using Google storage for git lfs, the ability
for Google to tamper with the objects in git in an undetectable way is
very limited so I'm less concerned about that particular usage of a
third-party resource. I've mentioned that I would love to see several
third-party storage solutions to be employed, ideally in different legal
jurisdictions, for redundancy purposes.
Colleagues, please elaborate if my explanation above is incorrect in any
way.
Ok that's understandable. Question #1 who pays for this? A datacenter rack
costs money. And whoever owns the data center has physical access. The
actual computer hardware costs money not just on a one time basis either.
Debian receives donations, both in-kind and cash.

Debian relies on hosting providers to provide, typically at no cost to
Debian, rack space and network access.

Frequently, this is with universities rather than corporations.
Post by Paul Dejean
Where does "hardware" begin and end? Does debian need to own the rack
rather than renting it? The screws you use to mount the server? The
Ethernet cables?
This is a hyperbolic line of inquiry that makes me inclined not to answer
further emails from you.
Post by Paul Dejean
There's a huge cost to maintaining this too. From my understanding there's
no mesos cluster setup right now, no kubernettes, no working openstack api.
Creating a private Debian cloud is a lot of work. Not creating a private
Debian cloud and just having a bunch of ad hoc servers is probably even
more work in the long run.
Most of Debian's infrastructure uses VMs (ganeti). casulana is an
exception.
Post by Paul Dejean
The idealogy is admirable but we need to define clearly what problem we're
trying to solve.
Is it avoiding vendor lock in? If so there might be ways
to use google cloud and avoid vendor lockin.
Use multiple clouds simultaneously, avoiding vendor-specific features, or
use a reasonable abstraction (fog).
Post by Paul Dejean
Is it trying to keep Google from having access to our private data? If
so a good first step would be stripping access from any Google
employees who might be Debian maintainers (which would be incredibly
silly).
That's not silly. How can Debian claim we have 'control over official
Debian cloud images' if we don't control who can access the various
cloud accounts by which we publish the images?

An important discussion to be had is whether and how to extend Debian
SSO into the cloud so that when DAM elects to close an account (or when
someone elects to retire), we close _all_ Debian-related access.

I don't view this as silly. I view it as appropriate account lifecycle
management. I encourage DMs to become DDs if they intend to do packaging
work, whether actual packages or cd-image or cd-cloud.
Post by Paul Dejean
Is it trying to avoid corporate influence? Amazon is already contributing
resources (i think might be remembering wrong) and there were plans for
Google to join in soon as was mentioned in this thread.
And we are very thankful for the resources that these corporations
provide. That said, it is important to many in the Debian community to
maintain an appropriate distance from them.
Post by Paul Dejean
I'm not trying to knock idealogy, it's what makes Debian not Red Hat. All
I'm saying is that we need to define what exactly the rules and goals are
here so we know what there is to work with.
And that's what happened over several Sprints and several BoFs.
--
Luca Filipozzi
Paul Dejean
2018-08-29 16:35:56 UTC
Permalink
The confusion arises in that my definition of "control over the server"
differs from yours.

I would say that a Google Cloud instance I spin up from my account is "a
server I control."

You would say "you don't control the server; Google does. In theory they can
go in and gain access."

So forget my definition. What was the agreed-upon definition of a
"Debian-controlled server" that was settled at this sprint? And was that
definition written down somewhere?
Luca Filipozzi
2018-08-29 16:45:17 UTC
Permalink
The latest such write-up is
https://www.mail-archive.com/debian-***@lists.debian.org/msg03317.html

fine, let's do top-posting

A Debian-controlled server is one that is managed by DSA and is,
typically, a physical server hosted by one of our partners.
Post by Paul Dejean
The confusion arises in that my definition of "control over the server"
differs from yours.
I would say that a Google cloud instance I spin up from my account is "a
server I control."
You would say "you don't control the server Google does. In theory they can
go in and gain access."
So forget my definition. What was the agreed upon definition of a "Debian
controlled server" that was defined at this sprint? And was that definition
written down somewhere?
--
Luca Filipozzi
Paul Dejean
2018-08-29 16:53:49 UTC
Permalink
Ok, well, first of all, I would have liked it if someone had told me from
the get-go "that's neat, but official Debian Vagrant boxes will never be
built on Google Cloud because it's against our policy."

Instead what happened is that people started talking about integrating SSO
with Google Cloud and similar, which left an entirely different impression
of what directions were being considered!

Second of all, I imagine that AMIs, Google Cloud images and other official
proprietary-format Debian images are exempt from this rule, since they can
only really be built from within the appropriate company's cloud services.
Paul Dejean
2018-08-29 17:06:08 UTC
Permalink
For reference here's the thread I'm referring to, it's been a while:

https://lists.debian.org/debian-cloud/2018/05/msg00007.html
Ok well first of all, I would have liked it if someone told me from the get go "that's neat but official Debian vagrant boxes will never be built on Google cloud because it's against our policy."
Instead what happened is people started talking about integrating SSO with Google cloud and similar, which left an entirely different impression on what directions were being considered!
Second of all I imagine that AMIs and Google cloud images and other official proprietary format Debian images are exempt from this rule, since they can only really be built from within the appropriate company's cloud services.
Post by Luca Filipozzi
The latest such write-up is
fine, let's do top-posting
Debian-controlled server is one that is managed by DSA and is,
typically, a physical server hosted by one of our partners.
Post by Paul Dejean
The confusion arises in that my definition of "control over the server"
differs from yours.
I would say that a Google cloud instance I spin up from my account is "a
server I control."
You would say "you don't control the server, Google does. In theory they can
go in and gain access."
So forget my definition. What was the agreed upon definition of a "Debian
controlled server" that was defined at this sprint? And was that definition
written down somewhere?
Post by Luca Filipozzi
(fixing top-posting)
Post by Paul Dejean
Post by Luca Filipozzi
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating
that machine images could be built in gitlab on google cloud runners that
have nested virt support.
Primarily, Debian (as a community) has long-held the opinion that our
packages, our cd images, and (by extension) our cloud images should be
built on hardware that is owned and operated by Debian. VMs provided by
a third party (AWS, etc.) are only as secure as the third party
(either poor architecture or nefarious intent) or as secure as the
hypervisor (against fourth parties).
This explains why all the build daemons are on Debian-controlled
hardware.
casulana was purchased to address two needs: cd-image and cloud-image
building. The former requires significant resource; the latter not
nearly as much.
Secondarily, as you will have seen by the salsa thread relating to use
of Google storage for git lfs, there are members of the community that
would like to see Debian choose options that (a) make use of open source
software and (b) make us less rather than more reliant on the good will
of entities such as Google and AWS.
Like I said earlier in the thread: the ongoing to-and-fro regarding
using casulana for build and using FAI is not useful at this stage.
Regardless of my personal opinion, I view these as settled discussion
points based on what I saw at the 2017 Cloud Sprint and at the DC18
Cloud BoF.
I'm very appreciative of Bastian's work on getting gitlab build jobs
prepared. gitlab doesn't use gridengine; we may not need to go that far,
but we may wish to introduce some kind of semaphore between gitlab jobs
and cd-image jobs to allow all of casulana to be used by the cd-image
scripts.
Finally, while salsa is using Google storage for git lfs, the ability
for Google to tamper with the objects in git in an undetectable way is
very limited so I'm less concerned about that particular usage of a
third-party resource. I've mentioned that I would love to see several
third-party storage solutions to be employed, ideally in different legal
jurisdictions, for redundancy purposes.
Colleagues, please elaborate if my explanation above is incorrect in any
way.
Post by Paul Dejean
Ok that's understandable. Question #1 who pays for this? A datacenter rack
costs money. And whoever owns the data center has physical access. The
actual computer hardware costs money not just on a one time basis either.
Debian receives donations, both in-kind and cash.
Debian relies on hosting providers to provide, typically at no cost to
Debian, rack space and network access.
Frequently, this is with universities rather than corporations.
Post by Paul Dejean
Where does "hardware" begin and end? Does debian need to own the rack
rather than renting it? The screws you use to mount the server? The
Ethernet cables?
This is a hyperbolic line of inquiry that makes me inclined to not answer
further emails from you.
Post by Paul Dejean
There's a huge cost to maintaining this too. From my understanding there's
no Mesos cluster setup right now, no Kubernetes, no working OpenStack API.
Creating a private Debian cloud is a lot of work. Not creating a private
Debian cloud and just having a bunch of ad hoc servers is probably even
more work in the long run.
Most of Debian's infrastructure uses VMs (ganeti). casulana is an
exception.
Post by Paul Dejean
The ideology is admirable but we need to define clearly what problem we're
trying to solve.
Is it avoiding vendor lock-in? If so there might be ways to use Google
cloud and avoid vendor lock-in.
Use multiple clouds simultaneously, avoiding vendor-specific features or
use a reasonable abstraction (fog).
Post by Paul Dejean
Is it trying to keep Google from having access to our private data? If
so a good first step would be stripping access from any Google
employees who might be Debian maintainers (which would be incredibly
silly).
That's not silly. How can Debian claim we have 'control over official
Debian cloud images' if we don't control who can access the various
cloud accounts by which we publish the images?
An important discussion to be had is whether and how to extend Debian
SSO into the cloud so that when DAM elects to close an account (or when
someone elects to retire), we close _all_ Debian-related access.
I don't view this as silly. I view it as appropriate account lifecycle
management. I encourage DMs to become DDs if they intend to do packaging
work, whether actual packages or cd-image or cd-cloud.
Post by Paul Dejean
Is it trying to avoid corporate influence? Amazon is already contributing
resources (I think, I might be remembering wrong) and there were plans for
Google to join in soon as was mentioned in this thread.
And we are very thankful for the resources that these corporations
provide. That said, it is important to many in the Debian community to
maintain an appropriate distance from them.
Post by Paul Dejean
I'm not trying to knock ideology, it's what makes Debian not Red Hat. All
I'm saying is that we need to define what exactly the rules and goals are
here so we know what there is to work with.
And that's what happened over several Sprints and several BoFs.
--
Luca Filipozzi
--
Luca Filipozzi
Jimmy Kaplowitz
2018-08-29 18:40:31 UTC
Permalink
Post by Paul Dejean
Second of all I imagine that AMIs and Google cloud images and other official
proprietary format Debian images are exempt from this rule, since they can
only really be built from within the appropriate company's cloud services.
Google's image format is publicly documented, quite standard (raw disk
image inside a tarball), and can be built just fine on casulana or
anywhere else. Google does their builds inside GCE merely as a
convenient way for them to manage transient build VMs, not out of
technical necessity.
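
For illustration only (a minimal sketch, not Google's or the cloud team's
actual tooling), packaging a raw disk image into that format needs nothing
more than, e.g.:

  import tarfile

  # GCE images are a gzip-compressed tarball containing a raw disk image
  # named disk.raw; GNU tar format is expected, which tarfile can produce.
  def make_gce_tarball(raw_image="disk.raw", output="image.tar.gz"):
      with tarfile.open(output, "w:gz", format=tarfile.GNU_FORMAT) as tar:
          tar.add(raw_image, arcname="disk.raw")

  if __name__ == "__main__":
      make_gce_tarball()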

I have no idea about the equivalent AMI answer.

- Jimmy Kaplowitz
***@debian.org
Bastian Blank
2018-08-29 19:03:42 UTC
Permalink
Post by Jimmy Kaplowitz
Post by Paul Dejean
Second of all I imagine that AMIs and Google cloud images and other official
proprietary format Debian images are exempt from this rule, since they can
only really be built from within the appropriate company's cloud services.
I really have no idea where this idea comes from. Of course all of them
can be built outside of the respective infrastructures. The build
process simply does not care where it runs.

The upload process to create a usable image inside the cloud systems
usually works from anywhere. This is true for GCE (and properly
documented) and for Azure (and not documented).
Post by Jimmy Kaplowitz
I have no idea about the equivalent AMI answer.
The sole exception is AWS, where the images, called AMIs, are built from
snapshots of existing disks, which can only be created by attaching
volumes to an existing VM.
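
For illustration, a hedged boto3 sketch of that AWS-specific final step,
registering an AMI from an EBS snapshot after the image has been written
to a volume; the IDs and names below are placeholders, not our tooling:

  import boto3

  ec2 = boto3.client("ec2", region_name="eu-west-1")
  # Snapshot of the volume the image was written to from a running VM.
  snapshot_id = "snap-0123456789abcdef0"

  resp = ec2.register_image(
      Name="debian-sid-amd64-daily-example",
      Architecture="x86_64",
      RootDeviceName="/dev/xvda",
      VirtualizationType="hvm",
      EnaSupport=True,
      BlockDeviceMappings=[{
          "DeviceName": "/dev/xvda",
          "Ebs": {"SnapshotId": snapshot_id, "DeleteOnTermination": True},
      }],
  )
  print("registered", resp["ImageId"])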

Bastian
--
Phasers locked on target, Captain.
Paul Dejean
2018-08-29 19:55:02 UTC
Permalink
The misconception came from my lack of experience with non-AWS cloud
providers. My bad.
Marcin Kulisz
2018-08-30 20:54:37 UTC
Permalink
Post by Luca Filipozzi
(fixing top-posting)
Luca,
Thx a lot for your email explaining our PoV on why we want to run builds on
Casulana as well as thx for your patience ;-)
--
|_|0|_| |
|_|_|0| "Panta rei" |
|0|0|0| -------- kuLa -------- |

gpg --keyserver pgp.mit.edu --recv-keys 0x686930DD58C338B3
3DF1 A4DF C732 4688 38BC F121 6869 30DD 58C3 38B3
Thomas Goirand
2018-08-30 11:44:12 UTC
Permalink
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating
that machine images could be built in gitlab on google cloud runners
that have nested virt support.
Let me put it the other way around.

Why should we build on external machines, when we do have all the needed
hardware at our disposal? I don't get it...
Also would like to add that by using cloud runners we circumvent a lot
of these issues such as all the shared permissions needed, having to set
up runners by hand (assuming we make an infra-as-code repo) and so on.
You may as well use Windows, so you don't need to build your own
operating system.
I view these as settled discussion
So do I. Can we move on?
Where does "hardware" begin and end? Does debian need to own the rack
rather than renting it? The screws you use to mount the server? The
Ethernet cables?
Funny, that's more or less what I said in my talk at Debconf18:
https://meetings-archive.debian.net/pub/debian-meetings/2018/DebConf18/2018-07-30/server-freedom-why-choosing-the-cloud-op.webm

I very much agree that having more things under our own control gives
more freedom. However, what counts is using free software. GCE clearly
isn't free software.
There's a huge cost to maintaining this too. From my understanding
there's no Mesos cluster setup right now, no Kubernetes, no working
OpenStack API. Creating a private Debian cloud is a lot of work. Not
creating a private Debian cloud and just having a bunch of ad hoc
servers is probably even more work in the long run.
I offered multiple times to the DSA team to give some help setting up a
full OpenStack cluster for the Debian infra. Maybe this will happen some
day. I'm currently writing software [1] to do this kind of setup fully
automatically using PXE boot and puppet. Hopefully, it will help.
All I'm saying is that we need to define what exactly the rules and
goals are here so we know what there is to work with.
We do have simple rules: everything should be built on Debian infra.
Second of all I imagine that AMIs and Google cloud images and other
official proprietary format Debian images are exempt from this rule,
since they can only really be built from within the appropriate
company's cloud services.
Sorry, but that's plain wrong. Take the official OpenStack image, upload
it to AWS, and there you go, you have a working official AWS image. That
image doesn't need to be built on any foreign hardware, it builds fine
on your own laptop.
The misconception came from my lack of experience with non-AWS cloud
providers. My bad.
Well, it's wrong as well for AWS, IMO.

Cheers,

Thomas Goirand (zigo)

[1] https://packages.debian.org/sid/openstack-cluster-installer
Paul Dejean
2018-08-30 12:57:47 UTC
Permalink
Ok.

Casulana's processors are Haswell and to the best of my knowledge support
nested virtualization. So it should be possible to run a gitlab runner VM
on Casulana that can do these builds.

There might be some tinkering required on the software side to get nested
virtualization working though.
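
A small sketch of the kind of check involved, assuming an Intel host (the
sysfs path below only exists once kvm_intel is loaded):

  #!/usr/bin/python3
  # Check whether nested virtualization is enabled for kvm_intel.
  from pathlib import Path

  nested = Path("/sys/module/kvm_intel/parameters/nested")
  if nested.exists() and nested.read_text().strip() in ("Y", "1"):
      print("nested virtualization is enabled")
  else:
      print("nested virtualization is disabled; try "
            "'options kvm_intel nested=1' in /etc/modprobe.d/")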
Paul Dejean
2018-08-30 12:59:01 UTC
Permalink
Oh, silly me, this was already done in the initial post; it's just a matter of
automating it...
Noah Meyerhans
2018-08-09 16:12:53 UTC
Permalink
Post by Jimmy Kaplowitz
Post by Bastian Blank
No, the main reason is isolation. The builds take some global
resources, loop devices, and may not return them in case of some errors.
Google builds their official GCE Debian images inside transient GCE
instances, solely for isolation purposes (they use the Debian cloud team
build tools, probably still bootstrap-vz until we get FAI sufficiently
working). To be clear, nothing about that needs to be in GCE, except for
a few implementation details of their particular build harness. Regular
VMs work fine.
At the Microsoft-hosted cloud sprint I proposed using cloud-provider VMs
for builds targeting that provider. This is not because of any
provider-specific behavior, but rather because the cloud providers
provide all the isolation, resource management, and automation hooks
that we could ask for. I still maintain that it's the better approach,
but was told at the time that the builds need to happen on Debian-owned
hardware, and that we had users specifically insisting on this. I'm not
convinced by that argument, nor have I heard anything from AWS users
expressing concern that the images are being built on AWS. Meanwhile I
have been building all the AWS images using small (t2.micro), transient
EC2 instances and a small shell script to handle the VM lifecycle and
have managed to completely avoid the complexity of that giant whiteboard
drawing from the sprint...

https://salsa.debian.org/noahm/ec2-image-builder/blob/master/bin/launch-fai-builder.sh
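
For illustration only (this is not the script above, just a hedged boto3
sketch of the same idea): start a small transient builder instance, hand
it a build script via user-data, and let it terminate itself when done.
The AMI ID and the script name are placeholders:

  import boto3

  ec2 = boto3.resource("ec2", region_name="us-east-1")

  instances = ec2.create_instances(
      ImageId="ami-0123456789abcdef0",             # placeholder builder AMI
      InstanceType="t2.micro",
      MinCount=1, MaxCount=1,
      UserData=open("build-and-upload.sh").read(), # hypothetical build script
      InstanceInitiatedShutdownBehavior="terminate",
  )
  builder = instances[0]
  builder.wait_until_running()
  print("transient builder running:", builder.id)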
Post by Jimmy Kaplowitz
I support the goal of isolation, but transient VMs can serve the same
purpose in a workflow that's more easily portable between casulana,
GitLab CI (I presume?), a personal dev laptop, and anywhere else one
might want to reproduce the flow. Which seems like a win for maximizing
how easy it is for people to hack on this - and also for companies like
Google to converge further with us on tooling.
Indeed. Some time ago, I posted on my blog about how users can use our
build tooling to generate their own custom AMIs that derive from our FAI
configs. The workflow is identical, because it uses common
infrastructure. A build process that relies on custom Debian
infrastructure is not going to be useful to users, meaning they'll have
to use a different workflow to build images, with different bugs, edge
cases, failure modes, etc. (Note that the post was written before the
above mentioned small shell script was written, so there are more steps.
I should update that post...)

https://noah.meyerhans.us/blog/2017/02/10/using-fai-to-customize-and-build-your-own-cloud-images/

noah
Luca Filipozzi
2018-08-10 23:34:12 UTC
Permalink
It would be so very good to stop circling the two main issues that recur
frequently: (1) building on casulana (debian-owned and operated
infrastructure) and (2) use of FAI. The fact that we keep circling (and
I get to be that mostly-outside-looking-in-guy in a do-ocracy world) on
these two issues is, in my view, a problem.

(Not picking on Noah, just happen to reply to his email :) ).
--
Luca Filipozzi
Jimmy Kaplowitz
2018-08-10 23:53:15 UTC
Permalink
Post by Luca Filipozzi
It would be so very good to stop circling the two main issues that recur
frequently: (1) building on casulana (debian-owned and operated
infrastructure) and (2) use of FAI. The fact that we keep circling (and
I get to be that mostly-outside-looking-in-guy in a do-ocracy world) on
these two issues is, in my view, a problem.
(Not picking on Noah, just happen to reply to his email :) ).
I think you're overestimating how much disagreement there is.

We've already agreed that Debian should be able to build its images on
casulana. Noah seems to feel it adds a barrier to getting the system in
place (empirically true, in my experience) and that it adds a risk of
making a Debian-specific locked-in workflow (which is what I'm pushing
back on to Bastian). It's clear that he'd prefer provider-specific build
platforms, but he said his piece at the sprint - he was just summarizing
that in this thread.

At this point, is anyone truly arguing against doing official builds on
Debian infrastructure if someone manages to put in the time and energy
to make it work in a way that people can easily reproduce
independently of Debian infrastructure? I don't think so.

Similarly, I think we're all agreed on using FAI, though I haven't
checked to see what's in Bastian's implementation.

Assuming I'm not missing something there, the main question is about how
to get the FAI builds set up and invoked, with a secondary question of
fixing the FAI builds on platforms where they don't yet work. Noah's
already building with FAI for AWS.

- Jimmy Kaplowitz
***@debian.org
Luca Filipozzi
2018-08-11 02:14:04 UTC
Permalink
Post by Jimmy Kaplowitz
I think you're overestimating how much disagreement there is.
I am happy to be wrong.
--
Luca Filipozzi
Thomas Goirand
2018-08-29 15:18:25 UTC
Permalink
Post by Jimmy Kaplowitz
Similarly, I think we're all agreed on using FAI, though I haven't
checked to see what's in Bastian's implementation.
It looks like the only person who expressed himself against FAI was
myself, and it has been like this since the decision (which was taken
when I wasn't present at the sprint, 2 years ago). However, I have always
respected the decision and just kept my mouth shut. I never bring that
topic up either. That being said, if you ask me, you already know the
answer, and you will inevitably trigger frustration, so just don't ask.

The other thing is that, at Debconf, I said I wouldn't do FAI. That's
not because I don't want to, it's just because I don't have the time
(and energy) to do it. However, what I didn't say is: I *do* intend
to contribute and maintain the OpenStack image when the solution is in
place, and make sure we have no regression compared to the current images.

One thing I intend to contribute is a list of things that FAI *must*
comply with when building OpenStack images. I discovered this list while
maintaining the image, and some of it was kind of surprising. It can be
compiled simply by reading openstack-debian-images with the background of
the bugs that were closed while writing it.

Hopefully, I'll get the document done before the sprint (but currently
a lot of OpenStack stuff is dragging me down... like OpenStack Rocky,
released this week, and more $work stuff...).

Cheers,

Thomas Goirand (zigo)
Marcin Kulisz
2018-08-08 20:29:22 UTC
Permalink
Post by Bastian Blank
Moin
Sorry, but it took a bit longer than I anticipated for an update.
I finally set up a VM on casulana, and it builds the stuff in our
https://salsa.debian.org/cloud-team/fai-cloud-images/pipelines/15397
Looks nice, thx for doing it.
Post by Bastian Blank
Post by Bastian Blank
Each build runs in a scratch Docker environment via a special
configured GitLab Runner. The builds need access to loop devices,
which is by default not allowed. (Yes, I'm aware that neither Docker,
nor GitLab Runner, have suitable versions in Debian Stretch.)
Right now this VM runs, but it got set up by hand. This obviously needs
Where is this VM running (what host)? Is it on salsa or somewhere else?
--
|_|0|_| |
|_|_|0| "Panta rei" |
|0|0|0| -------- kuLa -------- |

gpg --keyserver pgp.mit.edu --recv-keys 0x686930DD58C338B3
3DF1 A4DF C732 4688 38BC F121 6869 30DD 58C3 38B3
Bastian Blank
2018-08-08 21:07:12 UTC
Permalink
Post by Marcin Kulisz
Where is this VM running (what host)? Is it on salsa or somewhere else?
It runs on casulana.d.o.

Bastian
--
There is a multi-legged creature crawling on your shoulder.
-- Spock, "A Taste of Armageddon", stardate 3193.9
Bastian Blank
2018-08-08 21:38:45 UTC
Permalink
Post by Bastian Blank
Post by Marcin Kulisz
Where is this VM running (what host)? Is it on salsa or somewhere else?
It runs on casulana.d.o.
It should be possible to even build the images on the normal shared
CI runners, but this needs a small config change. This would allow
users to fork the repo and have it build the images without extra changes.
Only the official images would be built on casulana.

Bastian
--
The joys of love made her human and the agonies of love destroyed her.
-- Spock, "Requiem for Methuselah", stardate 5842.8
Bastian Blank
2018-08-29 09:54:48 UTC
Permalink
Hi

Another update on the way to get usable builds out of our stuff. As
usual, if something is unclear, please ask. I use a lot of that stuff
daily, so I know how it works, but others will not.

Maybe someone also wants to volunteer for some of the tasks I outline.

## What have been done

Each push to the main repo builds a bunch of images using Sid. Such
builds run in the shared runners that the Salsa team provides for all
projects. They take about 15 minutes to finish, but I would say that's
acceptable. This means each push will provide some output that can be
tested.

See for example
https://salsa.debian.org/cloud-team/debian-cloud-images/pipelines/17691

The resulting files can be downloaded from the job page (until expired)
https://salsa.debian.org/cloud-team/debian-cloud-images/-/jobs/41852/artifacts/browse

People can request a full set of builds on their commits by running a
new pipeline on the branch. This can be handy for fixes that only
affect a distribution other than Sid.

## What is awaiting finishing touches

### Official builds

Official builds, which can be uploaded and released, will be built by
the runner on casulana.d.o. This runner is selected on specially
triggered official builds.

This runner will only fetch builds on protected branches in the main
repo. Only people with Maintainer and Owner rights will be allowed to
push into those branches. This restricts who can actually run code on
it.

See
https://salsa.debian.org/cloud-team/debian-cloud-images/merge_requests/10

### EFI

EFI is a thing and we should support it.

See
https://salsa.debian.org/cloud-team/debian-cloud-images/merge_requests/9

## What needs to be done

### Documentation

I know I'm bad at writing documentation. Could someone please help with
that?

### Daily builds

After the official build stuff is merged, we just need to trigger it
daily with the correct version to use.

I would really prefer to do builds and later uploads daily. Just
because: we may find problems early, and it usually does not really
matter if we do a bit more traffic. We do that for Azure already and
Ubuntu seems to do it for all the clouds.

Also we need to clean the images up regularly.

This needs lingering enabled for the cloud-build user on casulana.d.o,
some services to start the gitlab-runner and the triggers.
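
A rough sketch of what such a daily trigger could look like, using
GitLab's pipeline trigger API; the project path, the token variable and
the CLOUD_BUILD_VERSION variable are placeholders, not the actual setup:

  import datetime
  import os
  import requests

  GITLAB = "https://salsa.debian.org/api/v4"
  PROJECT = "cloud-team%2Fdebian-cloud-images"   # URL-encoded project path

  resp = requests.post(
      f"{GITLAB}/projects/{PROJECT}/trigger/pipeline",
      data={
          "token": os.environ["TRIGGER_TOKEN"],  # pipeline trigger token
          "ref": "master",
          "variables[CLOUD_BUILD_VERSION]":
              datetime.date.today().strftime("%Y%m%d"),
      },
  )
  resp.raise_for_status()
  print("started pipeline", resp.json()["id"])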

### GCE upload

Using the Google SDK is cumbersome, as it is neither provided in Debian
nor available in a properly usable format. But a small script using
libcloud 2 does the uploads pretty well.

For the current state see
https://salsa.debian.org/waldi/debian-cloud-images/blob/gce-publish/bin/upload-gce
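
For comparison, a rough illustration (not the upload-gce script itself) of
registering a GCE image from a tarball already uploaded to Google Cloud
Storage with libcloud; service account, project and bucket are placeholders:

  from libcloud.compute.providers import get_driver
  from libcloud.compute.types import Provider

  GCE = get_driver(Provider.GCE)
  driver = GCE("image-upload@my-project.iam.gserviceaccount.com",
               "service-account.json", project="my-project")

  # The tarball (disk.raw inside image.tar.gz) must already be in GCS.
  image = driver.ex_create_image(
      name="debian-sid-amd64-daily-example",
      volume="https://storage.googleapis.com/my-bucket/image.tar.gz",
      description="Debian sid daily build (example)",
  )
  print("created image", image.name)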

Zach: we should talk about where we can publish images.

### Azure upload

That is something I need to do. We are currently waiting for the
Marketplace team from Microsoft to get the new way of managing images up
and running.

### EC2 upload

No idea yet. There is a script in our repo to create a single image.
But I see nothing that would handle the whole pipeline of replicating
and registering AMIs in all the regions.
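
A hedged sketch of the missing replication step with boto3 (placeholder
names and IDs): copy a registered AMI from its source region into every
other region:

  import boto3

  SOURCE_REGION = "us-east-1"
  SOURCE_AMI = "ami-0123456789abcdef0"

  # Enumerate all regions visible to the account.
  regions = [r["RegionName"] for r in
             boto3.client("ec2", region_name=SOURCE_REGION)
                  .describe_regions()["Regions"]]

  for region in regions:
      if region == SOURCE_REGION:
          continue
      ec2 = boto3.client("ec2", region_name=region)
      copy = ec2.copy_image(Name="debian-sid-amd64-daily-example",
                            SourceImageId=SOURCE_AMI,
                            SourceRegion=SOURCE_REGION)
      print(region, copy["ImageId"])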

### OpenStack image

Needs proper cloud-init config. And someone really should handle
cloud-init updates.

### Publishing of images on our own

We want to publish images somewhere on our own mirrors. This can be
added as additional jobs to the pipeline that get copies of all the
artifacts of the previous steps and can copy them where they should go.

### Release process

Puh. I created a repo for this task
(https://salsa.debian.org/cloud-team/debian-cloud-images-release), but
nothing more yet.

Regards,
Bastian
--
Beam me up, Scotty, there's no intelligent life down here!
Thomas Lange
2018-08-29 10:32:21 UTC
Permalink
Post by Bastian Blank
Official builds, which can be uploaded and released, will be built by
the runner on casulana.d.o. This runner is selected on specially
triggered official builds.
Is this a runner on casulana or a runner on salsa that logs into
casulana as user cloud-build and runs the build script there?
--
regards Thomas
Steffen Möller
2018-08-29 10:41:47 UTC
Permalink
Post by Thomas Lange
Post by Bastian Blank
Official builds, which can be uploaded and released, will be built by
the runner on casulana.d.o. This runner is selected on specially
triggered official builds.
Is this a runner on casulana or a runner on salsa that logs into
casulana as user cloud-build and runs the build script there?
Is that compatible with an outreach to our Blends that have some serious
interest in having images of theirs prepared in, say, AWS? Debian-Med comes
to mind. Those images would have cloud-relevant additions to regular Debian
setups, like the mounting of directories with public data that AWS offers
and that Debian's software knows how to use.

Cheers,

Steffen
Bastian Blank
2018-08-29 11:24:40 UTC
Permalink
Post by Steffen Möller
Is that compatible with an outreach to our Blends that have some serious
interest in having images of theirs prepared in, say, AWS? Debian-Med comes
to mind.
I would like to get our own stuff up and running first. In theory we
can add images as we like, so that the vendors don't kill us.
Post by Steffen Möller
Those images would have cloud-relevant additions to regular Debian setups,
like the mounting of directories with public data that AWS offers and that
Debian's software knows how to use.
I don't think this makes much sense. You need to set up software and
data anyway. And if you start a lot of identical systems, you need to
create specialized images anyway.

Regards,
Bastian
--
What kind of love is that? Not to be loved; never to have shown love.
-- Commissioner Nancy Hedford, "Metamorphosis",
stardate 3219.8
Bastian Blank
2018-08-29 11:20:43 UTC
Permalink
Post by Thomas Lange
Post by Bastian Blank
Official builds, which can be uploaded and released, will be built by
the runner on casulana.d.o. This runner is selected on specially
triggered official builds.
Is this a runner on casulana or a runner on salsa that logs into
casulana as user cloud-build and runs the build script there?
The runner runs on casulana and pulls jobs from salsa. We don't have
any direct login to the cloud-build user.

The running binaries are currently hand built by me and run in a tmux
session started by me.

Regards,
Bastian
--
Death, when unnecessary, is a tragic thing.
-- Flint, "Requiem for Methuselah", stardate 5843.7
Bastian Blank
2018-09-15 19:54:04 UTC
Permalink
Moin

I'm a bit sad that no response showed up to the items that actually mean
work. Why do we want to meet if no one seems to be actually prepared
to do the work?
Post by Bastian Blank
## What needs to be done
### Documentation
I know I'm bad at writing documentation. Could someone please help with
that?
So, no documentation?
Post by Bastian Blank
### Azure upload
That is something I need to do. We are currently waiting for the
Marketplace team from Microsoft to get the new way of managing images up
and running.
They came back with something. I'm planning how to handle it.
Post by Bastian Blank
### EC2 upload
No idea yet. There is a script in our repo to create a single image.
But I see nothing that would handle the whole pipeline with replicating
and registering AMI in all the regions.
Shouldn't we have something already? Even the wiki does not provide
information.

To test what I found, I need more rights on our AWS account, but the
people responsible for it seem to have vanished.
Post by Bastian Blank
### OpenStack image
Needs proper cloud-init config. And someone really could handle
cloud-init updates.
There were three or four people wanting to actually handle cloud-init.
Did anything happen?

Regards,
Bastian
--
Respect is a rational process
-- McCoy, "The Galileo Seven", stardate 2822.3
Noah Meyerhans
2018-09-18 05:25:40 UTC
Permalink
Post by Bastian Blank
Post by Bastian Blank
### EC2 upload
No idea yet. There is a script in our repo to create a single image.
But I see nothing that would handle the whole pipeline with replicating
and registering AMI in all the regions.
Shouldn't we have something already? Even the wiki does not provide
information.
The scripts are at https://salsa.debian.org/noahm/ec2-image-builder,
where they've always been (well, accounting for the alioth->salsa
migration). Check the list archives if you missed it previously.

Please remember that the stretch EC2 AMIs have been updated pretty much
constantly in all AWS regions (with every kernel or core package update,
and every point release) since stretch was released. I consider
publication of our FAI-based images to AWS to be pretty much a solved
problem.

Note that I don't really consider these scripts complete, in that they
don't necessarily all have as nice a UI as they might, and they could
all be a little more flexible. But they work for me, and I'm not really
inclined to work on them further as it's clear that some others don't
like how I've solved the problem, and the inability to reach a
satisfactory conclusion to the debates around how/where to build images
has left me without motivation for further involvement...

noah
