Discussion:
Building cloud images using Debian infrastructure
Bastian Blank
2018-03-18 14:49:48 UTC
Permalink
Hi

I did some more work on building the cloud images: how to schedule
builds, how to perform them and how to get the data where we need it.
The whole thing uses infrastructure Debian provides; the only exception
is the image release step.

The components are:
- salsa.debian.org with GitLab, responsible for storing code, scheduling
builds, storing logs and integrity details for eternity and storing
image data for a short amount of time.
- casulana.debian.org with kvm, responsible for providing the
environment to perform builds and tests.
- Somewhere for storing data with user access.
- Some environments may need additional systems (EC2 can only use
snapshots of existing disks, so the image release step must run
there).

All builds are orchestrated by GitLab CI, running on salsa.debian.org.
The whole build definition resides in the same Git repo as the FAI config
and scripts, so as much information as possible is kept in one place.
Builds are split up into jobs, each running independently and possibly
depending on another. For now I'd like to define the following jobs:

- Build images for all supported environments and dists.

On each push, regardless of the branch, a subset of these builds is
performed. All the other builds are only performed on the scheduled
runs.

On casulana we should be able to do a full set of builds in about 10
minutes. Right now the compression of the build result dominates the
required CPU time, as it uses xz -9 to get the output down to a usable
size for storage.

Each build runs in a scratch Docker environment via a specially
configured GitLab Runner. The builds need access to loop devices,
which is not allowed by default. (Yes, I'm aware that neither Docker
nor GitLab Runner has a suitable version in Debian Stretch.)

Some of these builds will currently fail (Azure on Buster, GCE on
Buster and Sid), so their status will be ignored.

I have a test repo that performs this operation already:
https://salsa.debian.org/waldi/fai-cloud-images/

Full build:
https://salsa.debian.org/waldi/fai-cloud-images/pipelines/3155

- Tests for images.

I'm not sure if scheduled builds should perform detailed tests on all
platforms, or if this should be restricted to releases and explicit
triggers.

At a minimum we should do a smoke test and check whether the system
boots to userspace (call qemu and read the serial output until systemd
shows itself; this weeds out a broken bootloader, kernel or filesystem).
Even without kvm this takes less than 30 seconds, so it is an easy test
to perform; a rough sketch follows below.

- For all scheduled runs, upload images and metadata to user-accessible
storage.

- A manual job to release the images. This triggers a pipeline from a
different project. This new pipeline contains the following jobs:

- Upload image to platforms.
- Test new instance using the images.
- Publish images.
- Notify debian-***@.

While this proposal introduces some complexity, it uses common
components and handles the special parts itself. It allows a user to
set up their own environment with only a small amount of effort:

- Set up GitLab or use an existing instance.
- Set up a GitLab Runner with Docker to run the builds, using some
documented config options.
- Configure some build options.

Regards,
Bastian
--
Our missions are peaceful -- not for conquest. When we do battle, it
is only because we have no choice.
-- Kirk, "The Squire of Gothos", stardate 2124.5
Bastian Blank
2018-03-24 14:51:58 UTC
Permalink
Post by Bastian Blank
- Tests for images.
I'm not sure if scheduled builds should perform detail tests on all
platforms, or if this should be restricted to releases and explicit
triggers.
I thought about tests again. I think we should do a test with a mocked
metadata service for each build, so we know that the image is able to
provision. This should take no more than two minutes.

The EC2 metadata protocol is easy and can be served from a static
server. GCE seems to be static as well. We just somehow need
169.254.169.254 to be accessible, but qemu's network guest forwards
should take care of that.
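
To illustrate, a minimal sketch of such a static mock for an EC2-style
metadata service; the paths and values served here are examples only, and
wiring it up so the guest reaches it as 169.254.169.254 (for instance via
qemu's guest forwards) is left to the test harness:

#!/usr/bin/env python3
# Tiny static mock of an EC2-style metadata service for provisioning tests.
# The paths and values are placeholders; a real test would serve whatever
# the image's provisioning tooling actually asks for.
from http.server import BaseHTTPRequestHandler, HTTPServer

METADATA = {
    "/latest/meta-data/instance-id": "i-00000000000000000",
    "/latest/meta-data/local-hostname": "test-image",
    "/latest/meta-data/public-keys/0/openssh-key": "ssh-rsa AAAA... test@example",
    "/latest/user-data": "#cloud-config\nhostname: test-image\n",
}

class MetadataHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = METADATA.get(self.path)
        if body is None:
            self.send_error(404)
            return
        data = body.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    # Listen locally; the guest reaches it as 169.254.169.254 through the
    # hypervisor's network forwarding.
    HTTPServer(("127.0.0.1", 8169), MetadataHandler).serve_forever()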

For Azure I'm not sure, as the agent generates an RSA key and somehow
uses it to communicate with the old metadata service; I have not yet
worked with the new service used on Azure Stack.

Any thoughts?

Are there more pictures of the stuff you wrote on the board at the last
cloud sprint? I only know about
https://photos.app.goo.gl/5066UjdxGNiH6Qur2

Regards,
Bastian
--
Where there's no emotion, there's no motive for violence.
-- Spock, "Dagger of the Mind", stardate 2715.1
Thomas Lange
2018-03-24 18:46:44 UTC
Permalink
I've uploaded two pictures of the whiteboard to this URL
https://informatik.uni-koeln.de/public/lange/debian-cloud-2017/
--
regards Thomas
Tomasz Rybak
2018-03-25 20:15:49 UTC
Permalink
Post by Bastian Blank
Post by Bastian Blank
- Tests for images.
I'm not sure if scheduled builds should perform detail tests on all
platforms, or if this should be restricted to releases and
explicit
triggers.
I thought about tests again. I think we should do a test with a mocked
metadata service for each build, so we know that the image is able to
provision. This should take no more than two minutes.
What do you mean by that?
I've recently been thinking about how to test our images,
and how to integrate the tests we've been working on during
the sprint; I haven't come to any solutions yet, though.

But it would be a good place to start.
Should I create new thread, so we can discuss it?

Best regards.
--
Tomasz Rybak, Debian Developer
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C
Bastian Blank
2018-03-26 19:12:48 UTC
Permalink
Post by Tomasz Rybak
What do you mean by that?
I've recently been thinking about how to test our images,
and how to integrate test we've been working on during
the spring; haven't come to any solutions yet, though.
Okay, let's outline my solution, which I haven't tested yet. For the
official builds all this stuff runs in kvm instances with nested-vmx on
casulana.

- Each test runs in a disposable Docker environment.

This gives us a completely scripted test environment, which means less
chance that documentation and implementation get out of sync, and it
guards against someone changing things by hand.

- It sets up a complete network:
  - a bridge,
  - some NAT for fairly controlled access to the outside,
  - DNS, DHCPv4 and DHCPv6 servers (let's see if dnsmasq works properly),
  - a mock metadata server for the tested platform (see the sketch after
    this list).

- It starts the image with kvm.

- It can then run the rest of the tests.
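
A rough sketch of how such a disposable environment could be wired up from
a script; the bridge name, addresses and tool invocations are assumptions
for illustration, they need root, and in the official setup they would run
inside the scratch environment rather than on the host:

#!/usr/bin/env python3
# Sketch of the disposable test network: bridge + NAT + dnsmasq, then boot
# the image under kvm. Names, addresses and options are examples only.
import subprocess

BRIDGE = "cloudtest0"
SUBNET = "192.0.2"

def run(*cmd):
    subprocess.run(cmd, check=True)

# bridge with an address, brought up
run("ip", "link", "add", "name", BRIDGE, "type", "bridge")
run("ip", "addr", "add", f"{SUBNET}.1/24", "dev", BRIDGE)
run("ip", "link", "set", BRIDGE, "up")

# NAT for controlled access to the outside
run("iptables", "-t", "nat", "-A", "POSTROUTING",
    "-s", f"{SUBNET}.0/24", "-j", "MASQUERADE")

# DNS and DHCP on the bridge
run("dnsmasq", f"--interface={BRIDGE}", "--bind-interfaces",
    f"--dhcp-range={SUBNET}.10,{SUBNET}.100,12h")

# the mock metadata server (see the earlier sketch) would be started here
# and 169.254.169.254 routed or DNAT'ed to it

# finally boot the image; attaching the guest NIC to the bridge (tap device
# or qemu bridge helper) is elided for brevity
run("kvm", "-m", "1024", "-nographic",
    "-drive", "file=disk.qcow2,format=qcow2,if=virtio")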

Bastian
--
Ahead warp factor one, Mr. Sulu.
Tomasz Rybak
2018-03-27 22:09:40 UTC
Permalink
Post by Bastian Blank
Post by Tomasz Rybak
What do you mean by that?
I've recently been thinking about how to test our images,
and how to integrate test we've been working on during
the spring; haven't come to any solutions yet, though.
Okay, let's outline my solution, which I didn't test yet. For the
official builds all this stuff runs in kvm instances with nested-vmx on
casulana.
- Each test runs in a disposable Docker environment.
This means we have a completely scripted test environment. This means
less chance that documentation and implementation gets out of sync.
Or just work against someone changing stuff by hand.
- a bridge,
- some NAT for pretty controlled access to the outside,
- dns, dhcpv4 and dhcpv6 server (lets see if dnsmasq works
properly),
- a mock metadata server for the tested plattform.
- It starts the image with kvm.
- It can then run the rest of the tests.
Seems reasonable.


OK, it's time to start thinking about integrating
our tests. It looks like we now have 3 testing solutions:
1. yours
2. one by Thomas (zigo)
3. one that we were working on during the last sprint:
https://salsa.debian.org/cloud-team/debian-cloud-tests

Have you written some code already? If so, where is it?

I'll try to gather all the information about our existing
testing solutions and send some summary to the list.

Best regards.
--
Tomasz Rybak, Debian Developer
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C
Bastian Blank
2018-03-28 09:39:48 UTC
Permalink
Post by Tomasz Rybak
OK, it's time to start thinking about integrating
1. yours
Mine is not so much a test solution as a test environment where you can
run a VM with an arbitrary image.
Post by Tomasz Rybak
Have you written some code already? If so, where is it?
I haven't written any code for the test setup yet.

Bastian
--
Intuition, however illogical, is recognized as a command prerogative.
-- Kirk, "Obsession", stardate 3620.7
Tomasz Rybak
2018-04-02 20:16:37 UTC
Permalink
Post by Bastian Blank
Post by Tomasz Rybak
OK, it's time to start thinking about integrating
1. yours
Mine is not so much a test solution but a test environment where you can
run a vm with an arbitrary image.
Our test harness (on which we worked in Seattle) starts
a VM in the cloud, checks it from the outside (open ports, etc.),
opens an SSH connection, copies code to the VM, and runs it (and thus
additional tests).

From your description it looks like we have some common areas
there.
Post by Bastian Blank
Post by Tomasz Rybak
Have you written some code already? If so, where is it?
I haven't written any code for the test setup yet.
Especially because of this - do you want to create your own solution,
or do you want to try to use and maybe extend our existing code?
Our tests (and management code) are written in Python.

Best regards.
--
Tomasz Rybak, Debian Developer
GPG: A565 CE64 F866 A258 4DDC F9C7 ECB7 3E37 E887 AA8C
Bastian Blank
2018-08-02 14:25:52 UTC
Permalink
Moin

Sorry, but it took a bit longer than I anticipated for an update.

I finally set up a VM on casulana, and it builds the stuff in our
repository:

https://salsa.debian.org/cloud-team/fai-cloud-images/pipelines/15397
Post by Bastian Blank
Each build runs in a scratch Docker environment via a special
configured GitLab Runner. The builds need access to loop devices,
which is by default not allowed. (Yes, I'm aware that neither Docker,
nor GitLab Runner, have suitable versions in Debian Stretch.)
Right now this VM runs, but it was set up by hand. This obviously needs
to change. However, during testing I found some problems with it:

The main problem is that all the builds need to run with pretty high
privileges and access shared resources (mostly the loop devices used to
mount filesystems). In case of errors, or if someone just cancels the
builds, these shared resources are not cleaned up and will break the
builder sooner or later.

Also, this VM needs at least 32 GB of RAM to run a sufficient number
of parallel builds (casulana has 88 CPU cores, and we should use them),
and it can't shrink between builds.

So I intend to change that setup a bit. It will look more like the
general-purpose CI runner for privileged builds I set up for Salsa over
the last few weeks. It will use docker-machine to set up and operate
scratch VMs. There is an experimental QEMU driver, which will need some
changes to use CoreOS for the VMs, but it uses qemu's user networking to
communicate with the VM, so it does not need any privileges.

Regards,
Bastian
--
Time is fluid ... like a river with currents, eddies, backwash.
-- Spock, "The City on the Edge of Forever", stardate 3134.0
Thomas Lange
2018-08-02 15:11:30 UTC
Permalink
Hi Bastian,

I'm not sure if you know that we have this new repository:
https://salsa.debian.org/cloud-team/qemu-vm

This is what Luca set up for OpenStack to create new, clean VMs. We may
want to use this for calling fai-diskimage inside the VMs when
building the images.

I want to avoid multiple people working on the same thing without
knowing what the others are doing. Or am I misinterpreting this and
yours is needed for something else? I'm confused.
--
regards Thomas
Bastian Blank
2018-08-02 16:09:08 UTC
Permalink
Post by Thomas Lange
This is what luca set up for openstack to create new clean VMs. We may
want to use this for calling fai-diskimage inside the VMs when
building the images.
Well, I don't want another single-use solution. How will you hook this
into GitLab CI?

Bastian
--
Killing is stupid; useless!
-- McCoy, "A Private Little War", stardate 4211.8
Bastian Blank
2018-08-02 16:33:26 UTC
Permalink
Post by Bastian Blank
Post by Thomas Lange
This is what luca set up for openstack to create new clean VMs. We may
want to use this for calling fai-diskimage inside the VMs when
building the images.
Well, I don't want another single use solution. How will you hook this
into the Gitlab CI?
Sorry, this sounded a bit harsh.

My project tries to use a lot of existing services, and it works pretty
well.

Of course, if you want to build the VM management by hand, you can do it.
But will it integrate, or do you need to replace the whole rest of the
stack as well?

Bastian
--
Another Armenia, Belgium ... the weak innocents who always seem to be
located on a natural invasion route.
-- Kirk, "Errand of Mercy", stardate 3198.4
Thomas Lange
2018-08-08 09:47:48 UTC
Permalink
Post by Bastian Blank
Of cause, if you want to build the VM management by hand, you can do it.
But will it integrate or do you need to replace the whole rest of the
stack as well?
I do not know how we want to build this whole CI stuff. I'm only the
FAI expert. But last year we worked on this build process a little bit
(there was this big picture on the whiteboard), and now something
different has shown up. That is the part that confuses me. I guess this
will be a big topic during the next cloud sprint.

I wonder why you worry about 32 GB of RAM. Casulana has a lot of RAM,
and 32 GB should not cause it any trouble. Was this the main reason to
build something using Docker and CoreOS?
--
regards Thomas
Bastian Blank
2018-08-08 21:14:38 UTC
Permalink
Post by Thomas Lange
I do not know how we want to build this whole CI stuff. I'm only the
FAI expert. But last year we worked on this build process a little bit
(there was this big picture on the whiteboard) and now something
different showed up.
I saw that picture, but it contained no implementation-specific
information. Also, Salsa did not exist last year, but now it does.
Post by Thomas Lange
I wonder why you worry about 32GB RAM? Casulana has a lot of RAM and
32GB should not cause any trouble to it.
It's just not nice to do it this way. And it limits scalability.
Post by Thomas Lange
Was this the main reason to
build something using docker and CoreOS?
No, the main reason is isolation. The builds take some global
resources (loop devices) and may not return them in case of errors.

Bastian
--
Virtue is a relative term.
-- Spock, "Friday's Child", stardate 3499.1
Jimmy Kaplowitz
2018-08-08 21:47:06 UTC
Permalink
Post by Bastian Blank
Post by Thomas Lange
Was this the main reason to
build something using docker and CoreOS?
No, the main reason it isolation. The builds take some global
resources, loop devices, and may not return them in case of some errors.
Google builds their official GCE Debian images inside transient GCE
instances, solely for isolation purposes (they use the Debian cloud team
build tools, probably still bootstrap-vz until we get FAI sufficiently
working). To be clear, nothing about that needs to be in GCE, except for
a few implementation details of their particular build harness. Regular
VMs work fine.

I support the goal of isolation, but transient VMs can serve the same
purpose in a workflow that's more easily portable between casulana,
GitLab CI (I presume?), a personal dev laptop, and anywhere else one
might want to reproduce the flow. That seems like a win for maximizing
how easy it is for people to hack on this, and also for companies like
Google to converge further with us on tooling.

- Jimmy Kaplowitz
***@debian.org
Bastian Blank
2018-08-09 14:11:19 UTC
Permalink
Post by Jimmy Kaplowitz
I support the goal of isolation, but transient VMs can serve the same
purpose
This setup uses transient VMs for isolation. Isolation is the goal;
transient VMs are the way to achieve it.

On casulana it can only run qemu directly. On GCE it would just start a
VM on the platform.
Post by Jimmy Kaplowitz
purpose in a workflow that's more easily portable between casulana,
GitLab CI (I presume?), a personal dev laptop, and anywhere else one
might want to reproduce the flow.
A user will have the following ways to build it:
- Push into the cloud-team repo and the builder on casulana will pick it
up.
- Push into a private repo and the shared builder will pick it up. This
does not yet work due to a missing config option and tags on the
builders.
- Use "gitlab-runner exec docker --docker-privileged $job" to run it
from the checkout on her own Docker instance.
- Use "gitlab-runner exec shell $job" or "gitlab-runner exec ssh $job"
to do the same either on the local machine or another one.
- Use "make $job" to run it by hand from the working copy. We need to
rename stuff a bit for that.

All of that needs documentation, including information on how to set up
a capable runner. I'm currently trying to convince gitlab.com to change
their config a bit so the build works without changes.

Regards
Bastian
--
The sooner our happiness together begins, the longer it will last.
-- Miramanee, "The Paradise Syndrome", stardate 4842.6
Jimmy Kaplowitz
2018-08-10 22:44:50 UTC
Permalink
Post by Bastian Blank
On casulana it only can run qemu directly. On GCE it would just start a
VM on the platform.
Ideally the workflow would work in any VM host, whether that's qemu,
GCE, GitLab CI, or AWS. There may be some platform-specific details in the
build harness, for example to provide relevant bits of configuration, but
those should just set up and invoke the same build flow regardless of
platform.

I notice that a lot of your instructions refer to Docker, though. Are
you talking about running Docker inside transient VMs or using it
instead of transient VMs?
Post by Bastian Blank
- Push into the cloud-team repo and the builder on casulana will pick it
up.
- Push into a private repo and the shared builder will pick it up. This
does not yet work due to a missing config option and tags on the
builders.
- Use "gitlab-runner exec docker --docker-privileged $job" to run it
from the checkout on her own Docker instance.
- Use "gitlab-runner exec shell $job" or "gitlab-runner exec ssh $job"
to do the same either on the local machine or another one.
- Use "make $job" to run it by hand from the working copy. We need to
rename stuff a bit for that.
All of that need documentation, including information how to setup a
capable runner. I'm currently trying to convince gitlab.com to change
their config a bit to make the build working without changes.
My hope is that nobody would need to know or use GitLab (except that
they would git clone our code from Salsa), nor have any write access to
Debian infrastructure whatsoever, in order to reproduce our builds.
Independent verifiability is good, and requiring installation of gitlab
or knowledge of gitlab-runner seems like unnecessary complexity.

I have no problem if we choose to use Salsa and/or casulana for our
builds. For builds which are run via Salsa, I imagine gitlab-runner
would be in the mix as a way to launch the build process. That's fine.

But any Debian user with a laptop and a way to run VMs should be able to
reproduce our builds without installing GitLab or seeking an account
from anyone. (The only exception would be if a provider-specific image
has a reason to require building on their platform, but then that would
still not require a Debian-controlled account.)

- Jimmy Kaplowitz
***@debian.org
Bastian Blank
2018-08-11 13:06:53 UTC
Permalink
Post by Jimmy Kaplowitz
Post by Bastian Blank
On casulana it only can run qemu directly. On GCE it would just start a
VM on the platform.
I notice that a lot of your instructions refer to Docker, though. Are
you talking about running Docker inside transient VMs or using it
instead of transient VMs?
I would not say it uses transient VMs if it didn't. So it runs Docker
as the runtime environment within transient VMs.
Post by Jimmy Kaplowitz
My hope is that nobody would need to know or use GitLab (except that
they would git clone our code from Salsa), nor have any write access to
Debian infrastructure whatsoever, in order to reproduce our builds.
I don't know where this idea comes from. While this stack uses Gitlab
services and runs on salsa.d.o, you can always use the inner layers
without it.
Post by Jimmy Kaplowitz
Independent verifiability is good, and requiring installation of gitlab
or knowledge of gitlab-runner seems like unnecessary complexity.
But any Debian user with a laptop and a way to run VMs should be able to
reproduce our builds without installing GitLab or seeking an account
from anyone.
Post by Bastian Blank
- Use "make $job" to run it by hand from the working copy. We need to
rename stuff a bit for that.
Where does it say Gitlab in this? It is just "make".

The whole thing is set up in layers. You can remove several until you
are at the basic build stuff. You don't need to use all of them; it's
just convenient.

Right now the layers are:
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai

I hope that clears up some assumptions.

Bastian
--
If some day we are defeated, well, war has its fortunes, good and bad.
-- Commander Kor, "Errand of Mercy", stardate 3201.7
Luca Filipozzi
2018-08-12 00:51:28 UTC
Permalink
Post by Bastian Blank
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopolizing the machine. One of the things we could/should
do is turn those 'build jobs' into things that can be executed by a
scheduler such as SGE or gitlab-runner. Can we inject
non-gitlab-originating jobs into gitlab's scheduler?

The other thing we could do is set up a system-wide semaphore that both
cd-build and gitlab-runner use, but I prefer the idea of enqueuing jobs
into SGE (or equivalent) and using appropriate parameters to identify
how much CPU/RAM/etc. is available on the runner vs. what a job needs.
That way, we make the best use of casulana.

Thoughts?
--
Luca Filipozzi
Bastian Blank
2018-08-12 07:56:18 UTC
Permalink
Hi Luca
Post by Luca Filipozzi
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopoloizing the machine.
Sadly the graphs created by munin are pretty useless for finding the real
short-term usage. For the CD-building peak (the only large peak on the
weekly graph) they show a disk usage of 2k IOPS and 22 CPUs.[1] That is
very far from full load for this machine.
Post by Luca Filipozzi
One of the things we could/should
do is turn those 'build jobs' things that can be executed by a scheduler
such as sge or gitlab-runner. Can we inject non-gitlab-originating jobs
into gitlab's scheduler?
You create a project and create jobs on it either by using schedules[2]
or with triggers[3].
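
For illustration, a minimal sketch of injecting a job this way through the
pipeline trigger API from [3]; the project ID, trigger token and variable
are placeholders that would have to be set up on the project first:

#!/usr/bin/env python3
# Sketch: start a pipeline on an existing project via the trigger API ([3]).
# Project ID, token and variables below are placeholders.
import urllib.parse
import urllib.request

GITLAB = "https://salsa.debian.org/api/v4"
PROJECT_ID = "12345"              # placeholder numeric project ID
TRIGGER_TOKEN = "secret-token"    # created in the project's CI settings

data = urllib.parse.urlencode({
    "token": TRIGGER_TOKEN,
    "ref": "master",
    "variables[BUILD_KIND]": "cd-image",   # example variable for the job
}).encode()

req = urllib.request.Request(
    f"{GITLAB}/projects/{PROJECT_ID}/trigger/pipeline", data=data)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())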
Post by Luca Filipozzi
The other thing we could do is set up system-wide semaphor that both
cd-build and gitlab-runner use, but i prefer the idea of enqueuing jobs
into SGE (or equivalent) and using appropriate parameters to identify
number of CPU/RAM/etc is available on the runner vs what a job needs.
gitlab-runner does not have any resource controls, apart from the number
of parallel jobs. In all the shared installations, resources are
controlled by hard limits per job in the form of a VM.

RAM usage looks pretty negligible; 90% is used by cache. For CPU
and IO usage there are no usable estimates due to missing data.
Post by Luca Filipozzi
That way, we make best use of casulana.
Thoughts?
I'm not convinced that we need such strong separation, but I've never
looked at the real usage.

Regards,
Bastian

[1]: https://munin.debian.org/debian.org/casulana.debian.org/index.html
[2]: https://docs.gitlab.com/ce/api/pipeline_schedules.html
[3]: https://docs.gitlab.com/ce/api/pipeline_triggers.html
--
We Klingons believe as you do -- the sick should die. Only the strong
should live.
-- Kras, "Friday's Child", stardate 3497.2
Thomas Goirand
2018-08-29 15:07:55 UTC
Permalink
Post by Luca Filipozzi
Post by Bastian Blank
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopoloizing the machine. One of the things we could/should
do is turn those 'build jobs' things that can be executed by a scheduler
such as sge or gitlab-runner. Can we inject non-gitlab-originating jobs
into gitlab's scheduler?
The other thing we could do is set up system-wide semaphor that both
cd-build and gitlab-runner use, but i prefer the idea of enqueuing jobs
into SGE (or equivalent)
What is SGE? A google non-free version of beanstalk?

Cheers,

Thomas Goirand (zigo)
Luca Filipozzi
2018-08-29 15:24:48 UTC
Permalink
Post by Thomas Goirand
Post by Luca Filipozzi
Post by Bastian Blank
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopoloizing the machine. One of the things we could/should
do is turn those 'build jobs' things that can be executed by a scheduler
such as sge or gitlab-runner. Can we inject non-gitlab-originating jobs
into gitlab's scheduler?
The other thing we could do is set up system-wide semaphor that both
cd-build and gitlab-runner use, but i prefer the idea of enqueuing jobs
into SGE (or equivalent)
What is SGE? A google non-free version of beanstalk?
Son of Grid Engine, the open-source fork? of Sun Grid Engine.

Packaged in Debian as 'gridengine-*'

A job scheduler / resource manager. More suited to clusters, but still
useful for a large machine like casulana.
--
Luca Filipozzi
Paul Dejean
2018-08-29 15:28:27 UTC
Permalink
I honestly don't get it. Why is casulana so necessary for building these
images going forward? What kicked off this thread was me demonstrating that
machine images could be built in GitLab on Google Cloud runners that have
nested virt support.
Post by Luca Filipozzi
Post by Thomas Goirand
Post by Luca Filipozzi
Post by Bastian Blank
- gitlab: repository store, job scheduler
- gitlab-runner: job executor
- docker-machine: VM handling with docker and directly supported by
gitlab-runner
- docker: runtime environment within transient VM
- gitlab ci script: just a script
- fai class wrapper
- fai
casulana is also used to build the CD images. Currently, the scripts
that build the CD images execute a number of 'build jobs' in parallel,
effectively monopoloizing the machine. One of the things we
could/should
Post by Thomas Goirand
Post by Luca Filipozzi
do is turn those 'build jobs' things that can be executed by a
scheduler
Post by Thomas Goirand
Post by Luca Filipozzi
such as sge or gitlab-runner. Can we inject non-gitlab-originating jobs
into gitlab's scheduler?
The other thing we could do is set up system-wide semaphor that both
cd-build and gitlab-runner use, but i prefer the idea of enqueuing jobs
into SGE (or equivalent)
What is SGE? A google non-free version of beanstalk?
Son of Grid Engine, the open-source fork? of Sun Grid Engine.
Packaged in Debian as 'gridengine-*'
A job scheduler / resource manager. More suited for cluster but still
useful for a large machine like casulana.
--
Luca Filipozzi
Paul Dejean
2018-08-29 15:34:51 UTC
Permalink
I would also like to add that by using cloud runners we circumvent a lot of
these issues, such as the shared permissions needed, having to set up
runners by hand (assuming we make an infrastructure-as-code repo), and so on.
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating that
machine images could be built in gitlab on google cloud runners that have
nested virt support.
Luca Filipozzi
2018-08-29 15:47:40 UTC
Permalink
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating that
machine images could be built in gitlab on google cloud runners that have
nested virt support.
Primarily, Debian (as a community) has long held the opinion that our
packages, our CD images, and (by extension) our cloud images should be
built on hardware that is owned and operated by Debian. VMs provided by
a third party (AWS, etc.) are only as secure as the third party
(either poor architecture or nefarious intent) or as secure as the
hypervisor (against fourth parties).

This explains why all the build daemons are on Debian-controlled
hardware.

casulana was purchased to address two needs: cd-image and cloud-image
building. The former requires significant resource; the latter not
nearly as much.

Secondarily, as you will have seen by the salsa thread relating to use
of Google storage for git lfs, there are members of the community that
would like to see Debian choose options that (a) make use of open source
software and (b) make us less rather than more reliant on the good will
of entities such as Google and AWS.

Like I said earlier in the thread: the ongoing to-and-fro regarding
using casulana for build and using FAI is not useful at this stage.
Regardless of my personal opinion, I view these as settled discussion
points based on what I saw at the 2017 Cloud Sprint and at the DC18
Cloud BoF.

I'm very appreciative of Bastian's work on getting gitlab build jobs
prepared. gitlab doesn't use gridengine; we may not need to go that far,
but we may wish to introduce some kind of semaphore between gitlab jobs
and cd-image jobs to allow all of casulana to be used by the cd-image
scripts.

Finally, while salsa is using Google storage for git lfs, the ability
for Google to tamper with the objects in git in an undetectable way is
very limited so I'm less concerned about that particular usage of a
third-party resource. I've mentioned that I would love to see several
third-party storage solutions to be employed, ideally in different legal
jurisdictions, for redundancy purposes.

Colleagues, please elaborate if my explanation above is incorrect in any
way.

Cheers,

Luca
--
Luca Filipozzi
Paul Dejean
2018-08-29 16:07:24 UTC
Permalink
OK, that's understandable. Question #1: who pays for this? A datacenter rack
costs money, and whoever owns the data center has physical access. The
actual computer hardware costs money, and not just on a one-time basis either.

Where does "hardware" begin and end? Does debian need to own the rack
rather than renting it? The screws you use to mount the server? The
Ethernet cables?

There's a huge cost to maintaining this too. From my understanding there's
no Mesos cluster set up right now, no Kubernetes, no working OpenStack API.
Creating a private Debian cloud is a lot of work. Not creating a private
Debian cloud and just having a bunch of ad hoc servers is probably even
more work in the long run.

The ideology is admirable, but we need to define clearly what problem we're
trying to solve. Is it avoiding vendor lock-in? If so, there might be ways
to use Google Cloud and avoid vendor lock-in. Is it trying to keep Google
from having access to our private data? If so, a good first step would be
stripping access from any Google employees who might be Debian maintainers
(which would be incredibly silly).

Is it trying to avoid corporate influence? Amazon is already contributing
resources (I think; I might be remembering wrong) and there were plans for
Google to join in soon, as was mentioned in this thread.

I'm not trying to knock ideology; it's what makes Debian not Red Hat. All
I'm saying is that we need to define what exactly the rules and goals are
here, so we know what there is to work with.
Post by Luca Filipozzi
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating
that
Post by Paul Dejean
machine images could be built in gitlab on google cloud runners that have
nested virt support.
Primarily, Debian (as a community) has long-held the opinion that our
packages, our cd images, and (by extension) our cloud images should be
built on hardware that is owned and operated by Debian. VMs provided by
a third party (AWS, etc.) are only as secure as the third party
(either poor architecture or nefarious intent) or as secure as the
hypervisor (against fourth parties).
This explains why all the build daemons are on Debian-controlled
hardware.
casulana was purchased to address two needs: cd-image and cloud-image
building. The former requires significant resource; the latter not
nearly as much.
Secondarily, as you will have seen by the salsa thread relating to use
of Google storage for git lfs, there are members of the community that
would like to see Debian choose options that (a) make use of open source
software and (b) make us less rather than more reliant on the good will
of entities such as Google and AWS.
Like I said earlier in the thread: the ongoing to-and-fro regarding
using casulana for build and using FAI is not useful at this stage.
Regardless of my personal opinion, I view these as settled discussion
points based on what I saw at the 2017 Cloud Sprint and at the DC18
Cloud BoF.
I'm very appreciative of Bastian's work on getting gitlab build jobs
prepared. gitlab doesn't use gridengine; we may not need to go that far,
but we may wish to introduce some kind of semaphor between gitlab jobs
and cd-image jobs to allow all of casulana to be used by the cd-image
scripts.
Finally, while salsa is using Google storage for git lfs, the ability
for Google to tamper with the objects in git in an undetectable way is
very limited so I'm less concerned about that particular usage of a
third-party resource. I've mentioned that I would love to see several
third-party storage solutions to be employed, ideally in different legal
jurisdictions, for redundancy purposes.
Colleagues, please elaborate if my explanation above is incorrect in any
way.
Cheers,
Luca
--
Luca Filipozzi
Luca Filipozzi
2018-08-29 16:22:13 UTC
Permalink
(fixing top-posting)
Post by Paul Dejean
Post by Luca Filipozzi
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating that
machine images could be built in gitlab on google cloud runners that have
nested virt support.
Primarily, Debian (as a community) has long-held the opinion that our
packages, our cd images, and (by extension) our cloud images should be
built on hardware that is owned and operated by Debian. VMs provided by
a third party (AWS, etc.) are only as secure as the third party
(either poor architecture or nefarious intent) or as secure as the
hypervisor (against fourth parties).
This explains why all the build daemons are on Debian-controlled
hardware.
casulana was purchased to address two needs: cd-image and cloud-image
building. The former requires significant resource; the latter not
nearly as much.
Secondarily, as you will have seen by the salsa thread relating to use
of Google storage for git lfs, there are members of the community that
would like to see Debian choose options that (a) make use of open source
software and (b) make us less rather than more reliant on the good will
of entities such as Google and AWS.
Like I said earlier in the thread: the ongoing to-and-fro regarding
using casulana for build and using FAI is not useful at this stage.
Regardless of my personal opinion, I view these as settled discussion
points based on what I saw at the 2017 Cloud Sprint and at the DC18
Cloud BoF.
I'm very appreciative of Bastian's work on getting gitlab build jobs
prepared. gitlab doesn't use gridengine; we may not need to go that far,
but we may wish to introduce some kind of semaphor between gitlab jobs
and cd-image jobs to allow all of casulana to be used by the cd-image
scripts.
Finally, while salsa is using Google storage for git lfs, the ability
for Google to tamper with the objects in git in an undetectable way is
very limited so I'm less concerned about that particular usage of a
third-party resource. I've mentioned that I would love to see several
third-party storage solutions to be employed, ideally in different legal
jurisdictions, for redundancy purposes.
Colleagues, please elaborate if my explanation above is incorrect in any
way.
Ok that's understandable. Question #1 who pays for this? A datacenter rack
costs money. And whoever owns the data center has physical access. The
actual computer hardware costs money not just on a one time basis either.
Debian receives donations, both in-kind and cash.

Debian relies on hosting providers to provide, typically at no cost to
Debian, rack space and network access.

Frequently, this is with universities rather than corporations.
Post by Paul Dejean
Where does "hardware" begin and end? Does debian need to own the rack
rather than renting it? The screws you use to mount the server? The
Ethernet cables?
This is a hyperbolic line of inquiry that makes me inclined not to answer
further emails from you.
Post by Paul Dejean
There's a huge cost to maintaining this too. From my understanding there's
no mesos cluster setup right now, no kubernettes, no working openstack api.
Creating a private Debian cloud is a lot of work. Not creating a private
Debian cloud and just having a bunch of ad hoc servers is probably even
more work in the long run.
Most of Debian's infrastructure uses VMs (ganeti). casulana is an
exception.
Post by Paul Dejean
The idealogy is admirable but we need to define clearly what problem we're
trying to solve.
Is it avoiding vendor lock in? If so there might be ways
to use google cloud and avoid vendor lockin.
Use multiple clouds simultaneously, avoiding vendor-specific features, or
use a reasonable abstraction (fog).
Post by Paul Dejean
Is it trying to keep Google from having access to our private data? If
so a good first step would be stripping access from any Google
employees who might be Debian maintainers (which would be incredibly
silly).
That's not silly. How can Debian claim we have 'control over official
Debian cloud images' if we don't control who can access the various
cloud accounts by which we publish the images?

An important discussion to be had is whether and how to extend Debian
SSO into the cloud so that when DAM elects to close an account (or when
someone elects to retire), we close _all_ Debian-related access.

I don't view this as silly. I view it as appropriate account lifecycle
management. I encourage DMs to become DDs if they intend to do packaging
work, whether actual packages or cd-image or cd-cloud.
Post by Paul Dejean
Is it trying to avoid corporate influence? Amazon is already contributing
resources (i think might be remembering wrong) and there were plans for
Google to join in soon as was mentioned in this thread.
And we are very thankful for the resources that these corporations
provide. That said, it is important to many in the Debian community to
maintain an appropriate distance from them.
Post by Paul Dejean
I'm not trying to knock idealogy, it's what makes Debian not Red Hat. All
I'm saying is that we need to define what exactly the rules and goals are
here so we know what there is to work with.
And that's what happened over several Sprints and several BoFs.
--
Luca Filipozzi
Paul Dejean
2018-08-29 16:35:56 UTC
Permalink
The confusion arises in that my definition of "control over the server"
differs from yours.

I would say that a Google Cloud instance I spin up from my account is "a
server I control."

You would say "you don't control the server; Google does. In theory they can
go in and gain access."

So forget my definition. What was the agreed-upon definition of a
"Debian-controlled server" that was settled at this sprint? And was that
definition written down somewhere?
Luca Filipozzi
2018-08-29 16:45:17 UTC
Permalink
The latest such write-up is
https://www.mail-archive.com/debian-***@lists.debian.org/msg03317.html

fine, let's do top-posting

A Debian-controlled server is one that is managed by DSA and is,
typically, a physical server hosted by one of our partners.
Post by Paul Dejean
The confusion arises in that my definition of "control over the server"
differs from yours.
I would say that a Google cloud instance I spin up from my account is "a
server I control."
You would say "you don't control the server Google does. In theory they can
go in and gain access."
So forget my definition. What was the agreed upon definition of a "Debian
controlled server" that was defined at this sprint? And was that definition
written down somewhere?
--
Luca Filipozzi
Paul Dejean
2018-08-29 16:53:49 UTC
Permalink
Ok, well, first of all, I would have liked it if someone had told me from
the get-go "that's neat, but official Debian Vagrant boxes will never be
built on Google Cloud because it's against our policy."

Instead what happened is that people started talking about integrating SSO
with Google Cloud and similar, which left an entirely different impression
of what directions were being considered!

Second of all, I imagine that AMIs, Google Cloud images and other official
proprietary-format Debian images are exempt from this rule, since they can
only really be built from within the appropriate company's cloud services.
Paul Dejean
2018-08-29 17:06:08 UTC
Permalink
For reference here's the thread I'm referring to, it's been a while:

https://lists.debian.org/debian-cloud/2018/05/msg00007.html
Ok well first of all, I would have liked it if someone told me from the get go "that's neat but official Debian vagrant boxes will never be built on Google cloud because it's against our policy."
Instead what happened is people started talking about integrating SSO with Google cloud and similar, which left an entirely different impression on what directions were being considered!
Second of all I imagine that AMIs and Google cloud images and other official proprietary format Debian images are exempt from this rule, since they can only really be built from within the appropriate company's cloud services.
Post by Luca Filipozzi
The latest such write-up is
fine, let's do top-posting
Debian-controlled server is one that is managed by DSA and is,
typically, a physical server hosted by one of our partners.
Post by Paul Dejean
The confusion arises in that my definition of "control over the server"
differs from yours.
I would say that a Google cloud instance I spin up from my account is "a
server I control."
You would say "you don't control the server, Google does. In theory they can
go in and gain access."
So forget my definition. What was the agreed upon definition of a "Debian
controlled server" that was defined at this sprint? And was that definition
written down somewhere?
Post by Luca Filipozzi
(fixing top-posting)
Post by Paul Dejean
Post by Luca Filipozzi
Post by Paul Dejean
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating
that machine images could be built in gitlab on google cloud runners that
have nested virt support.
Primarily, Debian (as a community) has long-held the opinion that our
packages, our cd images, and (by extension) our cloud images should be
built on hardware that is owned and operated by Debian. VMs provided by
a third party (AWS, etc.) are only as secure as the third party
(either poor architecture or nefarious intent) or as secure as the
hypervisor (against fourth parties).
This explains why all the build daemons are on Debian-controlled
hardware.
casulana was purchased to address two needs: cd-image and cloud-image
building. The former requires significant resource; the latter not
nearly as much.
Secondarily, as you will have seen by the salsa thread relating to use
of Google storage for git lfs, there are members of the community that
would like to see Debian choose options that (a) make use of open source
software and (b) make us less rather than more reliant on the good will
of entities such as Google and AWS.
Like I said earlier in the thread: the ongoing to-and-fro regarding
using casulana for build and using FAI is not useful at this stage.
Regardless of my personal opinion, I view these as settled discussion
points based on what I saw at the 2017 Cloud Sprint and at the DC18
Cloud BoF.
I'm very appreciative of Bastian's work on getting gitlab build jobs
prepared. gitlab doesn't use gridengine; we may not need to go that far,
but we may wish to introduce some kind of semaphore between gitlab jobs
and cd-image jobs to allow all of casulana to be used by the cd-image
scripts.
Finally, while salsa is using Google storage for git lfs, the ability
for Google to tamper with the objects in git in an undetectable way is
very limited so I'm less concerned about that particular usage of a
third-party resource. I've mentioned that I would love to see several
third-party storage solutions to be employed, ideally in different legal
jurisdictions, for redundancy purposes.
Colleagues, please elaborate if my explanation above is incorrect in any
way.
Post by Paul Dejean
Ok that's understandable. Question #1 who pays for this? A datacenter rack
costs money. And whoever owns the data center has physical access. The
actual computer hardware costs money not just on a one time basis either.
Debian receives donations, both in-kind and cash.
Debian relies on hosting providers to provide, typically at no cost to
Debian, rack space and network access.
Frequently, this is with universities rather than corporations.
Post by Paul Dejean
Where does "hardware" begin and end? Does debian need to own the rack
rather than renting it? The screws you use to mount the server? The
Ethernet cables?
This is a hyperbolic line of inquiry that makes me inclined to not answer
further emails from you.
Post by Paul Dejean
There's a huge cost to maintaining this too. From my understanding there's
no Mesos cluster setup right now, no Kubernetes, no working OpenStack API.
Creating a private Debian cloud is a lot of work. Not creating a private
Debian cloud and just having a bunch of ad hoc servers is probably even
more work in the long run.
Most of Debian's infrastructure uses VMs (ganeti). casulana is an
exception.
Post by Paul Dejean
The ideology is admirable but we need to define clearly what problem we're
trying to solve.
Is it avoiding vendor lock-in? If so there might be ways to use Google
cloud and avoid vendor lock-in.
Use multiple clouds simultaneously, avoiding vendor-specific features or
use a reasonable abstraction (fog).
Post by Paul Dejean
Is it trying to keep Google from having access to our private data? If
so a good first step would be stripping access from any Google
employees who might be Debian maintainers (which would be incredibly
silly).
That's not silly. How can Debian claim we have 'control over official
Debian cloud images' if we don't control who can access the various
cloud accounts by which we publish the images?
An important discussion to be had is whether and how to extend Debian
SSO into the cloud so that when DAM elects to close an account (or when
someone elects to retire), we close _all_ Debian-related access.
I don't view this as silly. I view it as appropriate account lifecycle
management. I encourage DMs to become DDs if they intend to do packaging
work, whether actual packages or cd-image or cd-cloud.
Post by Paul Dejean
Is it trying to avoid corporate influence? Amazon is already contributing
resources (I think, I might be remembering wrong) and there were plans for
Google to join in soon as was mentioned in this thread.
And we are very thankful for the resources that these corporations
provide. That said, it is important to many in the Debian community to
maintain an appropriate distance from them.
Post by Paul Dejean
I'm not trying to knock ideology, it's what makes Debian not Red Hat. All
I'm saying is that we need to define what exactly the rules and goals are
here so we know what there is to work with.
And that's what happened over several Sprints and several BoFs.
--
Luca Filipozzi
--
Luca Filipozzi
Jimmy Kaplowitz
2018-08-29 18:40:31 UTC
Permalink
Post by Paul Dejean
Second of all I imagine that AMIs and Google cloud images and other official
proprietary format Debian images are exempt from this rule, since they can
only really be built from within the appropriate company's cloud services.
Google's image format is publicly documented, quite standard (raw disk
image inside a tarball), and can be built just fine on casulana or
anywhere else. Google does their builds inside GCE merely as a
convenient way for them to manage transient build VMs, not out of
technical necessity.
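
For illustration only (a minimal sketch, not Google's or the cloud team's
actual tooling), packaging a raw disk image into that format needs nothing
more than, e.g.:

  import tarfile

  # GCE images are a gzip-compressed tarball containing a raw disk image
  # named disk.raw; GNU tar format is expected, which tarfile can produce.
  def make_gce_tarball(raw_image="disk.raw", output="image.tar.gz"):
      with tarfile.open(output, "w:gz", format=tarfile.GNU_FORMAT) as tar:
          tar.add(raw_image, arcname="disk.raw")

  if __name__ == "__main__":
      make_gce_tarball()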

I have no idea about the equivalent AMI answer.

- Jimmy Kaplowitz
***@debian.org
Bastian Blank
2018-08-29 19:03:42 UTC
Permalink
Post by Jimmy Kaplowitz
Post by Paul Dejean
Second of all I imagine that AMIs and Google cloud images and other official
proprietary format Debian images are exempt from this rule, since they can
only really be built from within the appropriate company's cloud services.
I really have no idea where this idea comes from. Of course all of them
can be built outside of the respective infrastructures. The build
process simply does not care where it runs.

The upload process to create a usable image inside the cloud systems
usually works from anywhere. This is true for GCE (and properly
documented) and for Azure (and not documented).
Post by Jimmy Kaplowitz
I have no idea about the equivalent AMI answer.
The sole exception is AWS, where the images, called AMIs, are built from
snapshots of existing disks, which can only be created by attaching
volumes to an existing VM.
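
For illustration, a hedged boto3 sketch of that AWS-specific final step,
registering an AMI from an EBS snapshot after the image has been written
to a volume; the IDs and names below are placeholders, not our tooling:

  import boto3

  ec2 = boto3.client("ec2", region_name="eu-west-1")
  # Snapshot of the volume the image was written to from a running VM.
  snapshot_id = "snap-0123456789abcdef0"

  resp = ec2.register_image(
      Name="debian-sid-amd64-daily-example",
      Architecture="x86_64",
      RootDeviceName="/dev/xvda",
      VirtualizationType="hvm",
      EnaSupport=True,
      BlockDeviceMappings=[{
          "DeviceName": "/dev/xvda",
          "Ebs": {"SnapshotId": snapshot_id, "DeleteOnTermination": True},
      }],
  )
  print("registered", resp["ImageId"])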

Bastian
--
Phasers locked on target, Captain.
Paul Dejean
2018-08-29 19:55:02 UTC
Permalink
The misconception came from my lack of experience with non-AWS cloud
providers. My bad.
Marcin Kulisz
2018-08-30 20:54:37 UTC
Permalink
Post by Luca Filipozzi
(fixing top-posting)
Luca,
Thx a lot for your email explaining our PoV on why we want to run builds on
Casulana as well as thx for your patience ;-)
--
|_|0|_| |
|_|_|0| "Panta rei" |
|0|0|0| -------- kuLa -------- |

gpg --keyserver pgp.mit.edu --recv-keys 0x686930DD58C338B3
3DF1 A4DF C732 4688 38BC F121 6869 30DD 58C3 38B3
Thomas Goirand
2018-08-30 11:44:12 UTC
Permalink
I honestly don't get it. Why is casulana so necessary for building these
images going forward. What kicked off this thread was me demonstrating
that machine images could be built in gitlab on google cloud runners
that have nested virt support.
Let me put it the other way around.

Why should we build on external machines, when we do have all the needed
hardware at our disposal? I don't get it...
Also would like to add that by using cloud runners we circumvent a lot
of these issues such as all the shared permissions needed, having to set
up runners by hand (assuming we make an infra-as-code repo) and so on.
You may as well use Windows, so you don't need to build your own
operating system.
I view these as settled discussion
So do I. Can we move on?
Where does "hardware" begin and end? Does debian need to own the rack
rather than renting it? The screws you use to mount the server? The
Ethernet cables?
Funny, that's more or less what I said in my talk at Debconf18:
https://meetings-archive.debian.net/pub/debian-meetings/2018/DebConf18/2018-07-30/server-freedom-why-choosing-the-cloud-op.webm

I very much agree that having more things under our own control gives
more freedom. However, what counts is using free software. GCE clearly
isn't free software.
There's a huge cost to maintaining this too. From my understanding
there's no Mesos cluster setup right now, no Kubernetes, no working
OpenStack API. Creating a private Debian cloud is a lot of work. Not
creating a private Debian cloud and just having a bunch of ad hoc
servers is probably even more work in the long run.
I offered multiple times to the DSA team to give some help setting up a
full OpenStack cluster for the Debian infra. Maybe this will happen some
day. I'm currently writing software [1] to do this kind of setup fully
automatically using PXE boot and puppet. Hopefully, it will help.
All I'm saying is that we need to define what exactly the rules and
goals are here so we know what there is to work with.
We do have simple rules: everything should be built on Debian infra.
Second of all I imagine that AMIs and Google cloud images and other
official proprietary format Debian images are exempt from this rule,
since they can only really be built from within the appropriate
company's cloud services.
Sorry, but that's plain wrong. Take the official OpenStack image, upload
it to AWS, and there you go, you have a working official AWS image. That
image doesn't need to be built on any foreign hardware, it builds fine
on your own laptop.
The misconception came from my lack of experience with non-AWS cloud
providers. My bad.
Well, it's wrong as well for AWS, IMO.

Cheers,

Thomas Goirand (zigo)

[1] https://packages.debian.org/sid/openstack-cluster-installer
Paul Dejean
2018-08-30 12:57:47 UTC
Permalink
Ok.

Casulana's processors are Haswell and to the best of my knowledge support
nested virtualization. So it should be possible to run a gitlab runner VM
on Casulana that can do these builds.

There might be some tinkering required on the software side to get nested
virtualization working though.
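
A small sketch of the kind of check involved, assuming an Intel host (the
sysfs path below only exists once kvm_intel is loaded):

  #!/usr/bin/python3
  # Check whether nested virtualization is enabled for kvm_intel.
  from pathlib import Path

  nested = Path("/sys/module/kvm_intel/parameters/nested")
  if nested.exists() and nested.read_text().strip() in ("Y", "1"):
      print("nested virtualization is enabled")
  else:
      print("nested virtualization is disabled; try "
            "'options kvm_intel nested=1' in /etc/modprobe.d/")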
Paul Dejean
2018-08-30 12:59:01 UTC
Permalink
Oh, silly me, this was already done in the initial post; it's just a matter of
automating it...
Noah Meyerhans
2018-08-09 16:12:53 UTC
Permalink
Post by Jimmy Kaplowitz
Post by Bastian Blank
No, the main reason is isolation. The builds take some global
resources, loop devices, and may not return them in case of some errors.
Google builds their official GCE Debian images inside transient GCE
instances, solely for isolation purposes (they use the Debian cloud team
build tools, probably still bootstrap-vz until we get FAI sufficiently
working). To be clear, nothing about that needs to be in GCE, except for
a few implementation details of their particular build harness. Regular
VMs work fine.
At the Microsoft-hosted cloud sprint I proposed using cloud-provider VMs
for builds targeting that provider. This is not because of any
provider-specific behavior, but rather because the cloud providers
provide all the isolation, resource management, and automation hooks
that we could ask for. I still maintain that it's the better approach,
but was told at the time that the builds need to happen on Debian-owned
hardware, and that we had users specifically insisting on this. I'm not
convinced by that argument, nor have I heard anything from AWS users
expressing concern that the images are being built on AWS. Meanwhile I
have been building all the AWS images using small (t2.micro), transient
EC2 instances and a small shell script to handle the VM lifecycle and
have managed to completely avoid the complexity of that giant whiteboard
drawing from the sprint...

https://salsa.debian.org/noahm/ec2-image-builder/blob/master/bin/launch-fai-builder.sh
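
For illustration only (this is not the script above, just a hedged boto3
sketch of the same idea): start a small transient builder instance, hand
it a build script via user-data, and let it terminate itself when done.
The AMI ID and the script name are placeholders:

  import boto3

  ec2 = boto3.resource("ec2", region_name="us-east-1")

  instances = ec2.create_instances(
      ImageId="ami-0123456789abcdef0",             # placeholder builder AMI
      InstanceType="t2.micro",
      MinCount=1, MaxCount=1,
      UserData=open("build-and-upload.sh").read(), # hypothetical build script
      InstanceInitiatedShutdownBehavior="terminate",
  )
  builder = instances[0]
  builder.wait_until_running()
  print("transient builder running:", builder.id)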
Post by Jimmy Kaplowitz
I support the goal of isolation, but transient VMs can serve the same
purpose in a workflow that's more easily portable between casulana,
GitLab CI (I presume?), a personal dev laptop, and anywhere else one
might want to reproduce the flow. Which seems like a win for maximizing
how easy it is for people to hack on this - and also for companies like
Google to converge further with us on tooling.
Indeed. Some time ago, I posted on my blog about how users can use our
build tooling to generate their own custom AMIs that derive from our FAI
configs. The workflow is identical, because it uses common
infrastructure. A build process that relies on custom Debian
infrastructure is not going to be useful to users, meaning they'll have
to use a different workflow to build images, with different bugs, edge
cases, failure modes, etc. (Note that the post was written before the
above mentioned small shell script was written, so there are more steps.
I should update that post...)

https://noah.meyerhans.us/blog/2017/02/10/using-fai-to-customize-and-build-your-own-cloud-images/

noah
Luca Filipozzi
2018-08-10 23:34:12 UTC
Permalink
It would be so very good to stop circling the two main issues that recur
frequently: (1) building on casulana (debian-owned and operated
infrastructure) and (2) use of FAI. The fact that we keep circling (and
I get to be that mostly-outside-looking-in-guy in a do-ocracy world) on
these two issues is, in my view, a problem.

(Not picking on Noah, just happen to reply to his email :) ).
--
Luca Filipozzi
Jimmy Kaplowitz
2018-08-10 23:53:15 UTC
Permalink
Post by Luca Filipozzi
It would be so very good to stop circling the two main issues that recur
frequently: (1) building on casulana (debian-owned and operated
infrastructure) and (2) use of FAI. The fact that we keep circling (and
I get to be that mostly-outside-looking-in-guy in a do-ocracy world) on
these two issues is, in my view, a problem.
(Not picking on Noah, just happen to reply to his email :) ).
I think you're overestimating how much disagreement there is.

We've already agreed that Debian should be able to build its images on
casulana. Noah seems to feel it adds a barrier to getting the system in
place (empirically true, in my experience) and that it adds a risk of
making a Debian-specific locked-in workflow (which is what I'm pushing
back on to Bastian). It's clear that he'd prefer provider-specific build
platforms, but he said his piece at the sprint - he was just summarizing
that in this thread.

At this point, is anyone truly arguing against doing official builds on
Debian infrastructure if someone manages to put in the time and energy
to make it work in a way that people can easily reproduce
independently of Debian infrastructure? I don't think so.

Similarly, I think we're all agreed on using FAI, though I haven't
checked to see what's in Bastian's implementation.

Assuming I'm not missing something there, the main question is about how
to get the FAI builds set up and invoked, with a secondary question of
fixing the FAI builds on platforms where they don't yet work. Noah's
already building with FAI for AWS.

- Jimmy Kaplowitz
***@debian.org
Luca Filipozzi
2018-08-11 02:14:04 UTC
Permalink
Post by Jimmy Kaplowitz
I think you're overestimating how much disagreement there is.
I am happy to be wrong.
--
Luca Filipozzi
Thomas Goirand
2018-08-29 15:18:25 UTC
Permalink
Post by Jimmy Kaplowitz
Similarly, I think we're all agreed on using FAI, though I haven't
checked to see what's in Bastian's implementation.
It looks like the only person who expressed himself against FAI was
myself, and it has been like this since the decision (which was taken
when I wasn't present at the sprint, 2 years ago). However, I have always
respected the decision and just kept my mouth shut. I never bring that
topic up either. That being said, if you ask me, you already know the
answer, and you will inevitably trigger frustration, so just don't ask.

The other thing is that, at Debconf, I said I wouldn't do FAI. That's
not because I don't want to, it's just because I don't have the time
(and energy) to do it. However, what I didn't say is: I *do* intend
to contribute and maintain the OpenStack image when the solution is in
place, and make sure we have no regression compared to the current images.

One thing I intend to contribute is a list of things that FAI *must*
comply with when building OpenStack images. I discovered this list while
maintaining the image, and some of it was kind of surprising. It can be
compiled simply by reading openstack-debian-images with the background of
the bugs that were closed while writing it.

Hopefully, I'll get the document done before the sprint (but currently
a lot of OpenStack stuff is dragging me down... like OpenStack Rocky,
released this week, and more $work stuff...).

Cheers,

Thomas Goirand (zigo)
Marcin Kulisz
2018-08-08 20:29:22 UTC
Permalink
Post by Bastian Blank
Moin
Sorry, but it took a bit longer than I anticipated for an update.
I finally set up a VM on casulana, and it builds the stuff in our
https://salsa.debian.org/cloud-team/fai-cloud-images/pipelines/15397
Looks nice, thx for doing it.
Post by Bastian Blank
Post by Bastian Blank
Each build runs in a scratch Docker environment via a special
configured GitLab Runner. The builds need access to loop devices,
which is by default not allowed. (Yes, I'm aware that neither Docker,
nor GitLab Runner, have suitable versions in Debian Stretch.)
Right now this VM runs, but it got set up by hand. This obviously needs
Where is this VM running (what host)? Is it on salsa or somewhere else?
--
|_|0|_| |
|_|_|0| "Panta rei" |
|0|0|0| -------- kuLa -------- |

gpg --keyserver pgp.mit.edu --recv-keys 0x686930DD58C338B3
3DF1 A4DF C732 4688 38BC F121 6869 30DD 58C3 38B3
Bastian Blank
2018-08-08 21:07:12 UTC
Permalink
Post by Marcin Kulisz
Where is this VM running (what host)? Is it on salsa or somewhere else?
It runs on casulana.d.o.

Bastian
--
There is a multi-legged creature crawling on your shoulder.
-- Spock, "A Taste of Armageddon", stardate 3193.9
Bastian Blank
2018-08-08 21:38:45 UTC
Permalink
Post by Bastian Blank
Post by Marcin Kulisz
Where is this VM running (what host)? Is it on salsa or somewhere else?
It runs on casulana.d.o.
It should be possible to even build the images on the normal shared
CI runners, but this needs a small config change. This would allow
users to fork the repo and have it build the images without extra changes.
Only the official images would be built on casulana.

Bastian
--
The joys of love made her human and the agonies of love destroyed her.
-- Spock, "Requiem for Methuselah", stardate 5842.8
Bastian Blank
2018-08-29 09:54:48 UTC
Permalink
Hi

Another update on the way to get usable builds out of our stuff. As
usual, if something is unclear, please ask. I use a lot of that stuff
daily, so I know how it works, but others will not.

Maybe someone also wants to volunteer for some of the tasks I outline.

## What have been done

Each push to the main repo builds a bunch of images using Sid. Such
builds run in the shared runners that the Salsa team provides for all
projects. They take about 15 minutes to finish, but I would say that's
acceptable. This means each push will provide some output that can be
tested.

See for example
https://salsa.debian.org/cloud-team/debian-cloud-images/pipelines/17691

The resulting files can be downloaded from the job page (until expired)
https://salsa.debian.org/cloud-team/debian-cloud-images/-/jobs/41852/artifacts/browse

People can request a full set of builds on their commits by running a
new pipeline on the branch. This can be handy for fixes that only
affect a distribution other than Sid.

## What is awaiting finishing touches

### Official builds

Official builds, which can be uploaded and released, will be built by
the runner on casulana.d.o. This runner is selected on specially
triggered official builds.

This runner will only fetch builds on protected branches in the main
repo. Only people with Maintainer and Owner rights will be allowed to
push into those branches. This restricts who can actually run code on
it.

See
https://salsa.debian.org/cloud-team/debian-cloud-images/merge_requests/10

### EFI

EFI is a thing and we should support it.

See
https://salsa.debian.org/cloud-team/debian-cloud-images/merge_requests/9

## What needs to be done

### Documentation

I know I'm bad at writing documentation. Could someone please help with
that?

### Daily builds

After the official build stuff is merged, we just need to trigger it
daily with the correct version to use.

I would really prefer to do builds and later uploads daily. Just
because: we may find problems early, and it usually does not really
matter if we do a bit more traffic. We do that for Azure already and
Ubuntu seems to do it for all the clouds.

Also we need to clean the images up regularly.

This needs lingering enabled for the cloud-build user on casulana.d.o,
some services to start the gitlab-runner and the triggers.
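
A rough sketch of what such a daily trigger could look like, using
GitLab's pipeline trigger API; the project path, the token variable and
the CLOUD_BUILD_VERSION variable are placeholders, not the actual setup:

  import datetime
  import os
  import requests

  GITLAB = "https://salsa.debian.org/api/v4"
  PROJECT = "cloud-team%2Fdebian-cloud-images"   # URL-encoded project path

  resp = requests.post(
      f"{GITLAB}/projects/{PROJECT}/trigger/pipeline",
      data={
          "token": os.environ["TRIGGER_TOKEN"],  # pipeline trigger token
          "ref": "master",
          "variables[CLOUD_BUILD_VERSION]":
              datetime.date.today().strftime("%Y%m%d"),
      },
  )
  resp.raise_for_status()
  print("started pipeline", resp.json()["id"])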

### GCE upload

Using the Google SDK is cumbersome, as it is neither provided in Debian
nor available in a properly usable format. But a small script using
libcloud 2 does the uploads pretty well.

For the current state see
https://salsa.debian.org/waldi/debian-cloud-images/blob/gce-publish/bin/upload-gce
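
For comparison, a rough illustration (not the upload-gce script itself) of
registering a GCE image from a tarball already uploaded to Google Cloud
Storage with libcloud; service account, project and bucket are placeholders:

  from libcloud.compute.providers import get_driver
  from libcloud.compute.types import Provider

  GCE = get_driver(Provider.GCE)
  driver = GCE("image-upload@my-project.iam.gserviceaccount.com",
               "service-account.json", project="my-project")

  # The tarball (disk.raw inside image.tar.gz) must already be in GCS.
  image = driver.ex_create_image(
      name="debian-sid-amd64-daily-example",
      volume="https://storage.googleapis.com/my-bucket/image.tar.gz",
      description="Debian sid daily build (example)",
  )
  print("created image", image.name)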

Zach: we should talk about where we can publish images.

### Azure upload

That is something I need to do. We are currently waiting for the
Marketplace team from Microsoft to get the new way of managing images up
and running.

### EC2 upload

No idea yet. There is a script in our repo to create a single image.
But I see nothing that would handle the whole pipeline of replicating
and registering AMIs in all the regions.
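
A hedged sketch of the missing replication step with boto3 (placeholder
names and IDs): copy a registered AMI from its source region into every
other region:

  import boto3

  SOURCE_REGION = "us-east-1"
  SOURCE_AMI = "ami-0123456789abcdef0"

  # Enumerate all regions visible to the account.
  regions = [r["RegionName"] for r in
             boto3.client("ec2", region_name=SOURCE_REGION)
                  .describe_regions()["Regions"]]

  for region in regions:
      if region == SOURCE_REGION:
          continue
      ec2 = boto3.client("ec2", region_name=region)
      copy = ec2.copy_image(Name="debian-sid-amd64-daily-example",
                            SourceImageId=SOURCE_AMI,
                            SourceRegion=SOURCE_REGION)
      print(region, copy["ImageId"])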

### OpenStack image

Needs proper cloud-init config. And someone really should handle
cloud-init updates.

### Publishing of images on our own

We want to publish images somewhere on our own mirrors. This can be
added as additional jobs to the pipeline that get copies of all the
artifacts of the previous steps and can copy them where they should go.

### Release process

Puh. I created a repo for this task
(https://salsa.debian.org/cloud-team/debian-cloud-images-release), but
nothing more yet.

Regards,
Bastian
--
Beam me up, Scotty, there's no intelligent life down here!
Thomas Lange
2018-08-29 10:32:21 UTC
Permalink
Post by Bastian Blank
Official builds, which can be uploaded and released, will be built by
the runner on casulana.d.o. This runner is selected on specially
triggered official builds.
Is this a runner on casulana or a runner on salsa that logs into
casulana as user cloud-build and runs the build script there?
--
regards Thomas
Steffen Möller
2018-08-29 10:41:47 UTC
Permalink
Post by Thomas Lange
Post by Bastian Blank
Official builds, which can be uploaded and released, will be built by
the runner on casulana.d.o. This runner is selected on specially
triggered official builds.
Is this a runner on casulana or a runner on salsa that logs into
casulana as user cloud-build and runs the build script there?
Is that compatible with an outreach to our Blends that have some serious
interest in having images of theirs prepared in, say, AWS? Debian-Med comes
to mind. Those images would have cloud-relevant additions to regular Debian
setups, like the mounting of directories with public data that AWS offers
and that Debian's software knows how to use.

Cheers,

Steffen
Bastian Blank
2018-08-29 11:24:40 UTC
Permalink
Post by Steffen Möller
Is that compatible with an outreach to our Blends that have some serious
interest in having images of theirs prepared in, say, AWS? Debian-Med comes
to mind.
I would like to get our own stuff up and running first. In theory we
can add images as we like, so that the vendors don't kill us.
Post by Steffen Möller
Those images would have cloud-relevant additions to regular Debian setups,
like the mounting of directories with public data that AWS offers and that
Debian's software knows how to use.
I don't think this makes much sense. You need to set up software and
data anyway. And if you start a lot of identical systems, you need to
create specialized images anyway.

Regards,
Bastian
--
What kind of love is that? Not to be loved; never to have shown love.
-- Commissioner Nancy Hedford, "Metamorphosis",
stardate 3219.8
Bastian Blank
2018-08-29 11:20:43 UTC
Permalink
Post by Thomas Lange
Post by Bastian Blank
Official builds, which can be uploaded and released, will be built by
the runner on casulana.d.o. This runner is selected on specially
triggered official builds.
Is this a runner on casulana or a runner on salsa that logs into
casulana as user cloud-build and runs the build script there?
The runner runs on casulana and pulls jobs from salsa. We don't have
any direct login to the cloud-build user.

The running binaries are currently hand built by me and run in a tmux
session started by me.

Regards,
Bastian
--
Death, when unnecessary, is a tragic thing.
-- Flint, "Requiem for Methuselah", stardate 5843.7
Bastian Blank
2018-09-15 19:54:04 UTC
Permalink
Moin

I'm a bit sad that no response showed up to the items that actually mean
work. Why do we want to meet if no one seems to be actually prepared
to do the work?
Post by Bastian Blank
## What needs to be done
### Documentation
I know I'm bad at writing documentation. Could someone please help with
that?
So, no documentation?
Post by Bastian Blank
### Azure upload
That is something I need to do. We are currently waiting for the
Marketplace team from Microsoft to get the new way of managing images up
and running.
They came back with something. I'm planning how to handle it.
Post by Bastian Blank
### EC2 upload
No idea yet. There is a script in our repo to create a single image.
But I see nothing that would handle the whole pipeline with replicating
and registering AMI in all the regions.
Shouldn't we have something already? Even the wiki does not provide
information.

To test what I found, I need more rights on our AWS account, but the
people responsible for it seem to have vanished.
Post by Bastian Blank
### OpenStack image
Needs proper cloud-init config. And someone really could handle
cloud-init updates.
There were three or four people wanting to actually handle cloud-init.
Did anything happen?

Regards,
Bastian
--
Respect is a rational process
-- McCoy, "The Galileo Seven", stardate 2822.3
Noah Meyerhans
2018-09-18 05:25:40 UTC
Permalink
Post by Bastian Blank
Post by Bastian Blank
### EC2 upload
No idea yet. There is a script in our repo to create a single image.
But I see nothing that would handle the whole pipeline with replicating
and registering AMI in all the regions.
Shouldn't we have something already? Even the wiki does not provide
information.
The scripts are at https://salsa.debian.org/noahm/ec2-image-builder,
where they've always been (well, accounting for the alioth->salsa
migration). Check the list archives if you missed it previously.

Please remember that the stretch EC2 AMIs have been updated pretty much
constantly in all AWS regions (with every kernel or core package update,
and every point release) since stretch was released. I consider
publication of our FAI-based images to AWS to be pretty much a solved
problem.

Note that I don't really consider these scripts complete, in that they
don't necessarily all have as nice a UI as they might, and they could
all be a little more flexible. But they work for me, and I'm not really
inclined to work on them further as it's clear that some others don't
like how I've solved the problem, and the inability to reach a
satisfactory conclusion to the debates around how/where to build images
has left me without motivation for further involvement...

noah
