Rewrite Dockerfile into multi-stage build, minor updates to support that change #1362

Open
rtizzy wants to merge 18 commits into oraios:main from rtizzy:dockerfilesanity

@rtizzy

rtizzy commented Apr 16, 2026

The previous Docker setup is a bit confusing but I tried to infer what was intended.

Changes:

  1. Ensures a proper ENTRYPOINT and CMD are set so the MCP server starts successfully
  2. Utilizes a multi-stage build so the version pushed to the registry is a slim image (~275 MB) that can have various LSPs and tooling layered on top as needed.
  3. Includes minor updates to docs and ensures the push workflow leverages the proper target (prod)

The idea here is that someone who wishes to run this MCP inside of Docker can base off the pushed serena image and layer on their necessary dependencies.

Still a bit more work to do here and I'm currently experimenting a bit locally.

@MischaPanch
Member

Thanks, yes, it would be nice to improve this. We haven't paid much attention to the Docker topic or to optimizing the image. Pls ping me when you are done on your side, I'll then review.

Pinging @tirthpatel90 who is also working on improving Serena's docker issues

@tirthpatel90

Awesome work @rtizzy! Just a heads-up, I'm currently working on Dockerfile.maximal in #1339 specifically to bake in all the heavy language dependencies (like R, Julia, OCaml) to accelerate the CI pipelines. Looking forward to seeing both our Docker optimizations in the next release!

@rtizzy
Author

rtizzy commented Apr 22, 2026

@tirthpatel90

Thanks. A bit of constructive criticism:

It may make sense to step back and look at the entire CI/CD pipeline instead of just the Dockerfile for a few reasons.

The current multi-OS test suite runs outside Docker so that it exercises Serena on each exact OS the way a user running it natively would, which is still the standard setup. At minimum, you will still need the Windows test to exist outside Docker.

I haven't had a chance to dig fully into the tests yet, but if the tests do not all require the full toolchain (my bet is they don't), what likely makes sense is something like:

  1. Segment the tests by toolchain
  2. Parallelize the CI/CD workflows across both OS (if desired) and toolchain.

That means instead of needing one fat image, you can have a base optimized image (such as the one I'm working on here), base further images off of that if desired, install only the necessary toolchain, and run many of the tests in parallel.

That optimization will work with or without Docker and should result in a significant speedup.
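
As a rough illustration of the two steps above, the sketch below expands a list of operating systems and a list of toolchain segments into one job per pair. The runner labels are the standard GitHub-hosted ones, but the toolchain names are hypothetical and not taken from the Serena test suite. Whether the Windows jobs run inside or outside Docker is left to the workflow that consumes the result.

```python
import json
from itertools import product

# Standard GitHub-hosted runner labels; the toolchain segments are hypothetical.
OSES = ["ubuntu-latest", "macos-latest", "windows-latest"]
TOOLCHAINS = ["python", "rust", "java"]


def build_matrix(oses, toolchains):
    """Return one CI job per (OS, toolchain) pair so the jobs can run in parallel."""
    return [{"os": os_name, "toolchain": tc} for os_name, tc in product(oses, toolchains)]


if __name__ == "__main__":
    # Emit JSON so a workflow step could feed it into a dynamic job matrix.
    print(json.dumps({"include": build_matrix(OSES, TOOLCHAINS)}, indent=2))
```

Each job then only needs to install the toolchain named in its own entry, rather than every toolchain at once.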

I intended to perform that work after finishing this PR but feel free to pick it up.

@rtizzy
Author

rtizzy commented Apr 22, 2026

@MischaPanch

Hope you're well.

Feel free to give this a once over and let me know if there is anything I might have missed.

I did have a question to make sure I'm understanding something correctly:

I'm pretty sure the architecture is that Serena MCP can only have one project activated at a time, i.e., a single MCP server can only have one project activated and in use.

Is that correct?

rtizzy force-pushed the dockerfilesanity branch 2 times, most recently from d190159 to a95532b on April 22, 2026 10:06

@tirthpatel90

Hi @rtizzy,

Thanks for the constructive feedback! You make an excellent architectural point. Segmenting the tests by toolchain and running them in parallel matrix jobs is definitely a cleaner, more scalable CI/CD practice compared to maintaining a monolithic image.

I would be happy to pivot and pick up the CI workflow segmentation to work alongside your base image.

@MischaPanch, what are your thoughts on this direction? If you prefer this parallelized approach over the Phase 1 maximal image, I can hold off on #1339 and start drafting the matrix workflows to run the segmented tests using rtizzy's optimized image. Let me know how you'd like to proceed!

@MischaPanch
Member

Hi @rtizzy . Thanks for the contribution and the discussion!

> I'm pretty sure the architecture is that Serena MCP can only have one project activated at a time, i.e., a single MCP server can only have one project activated and in use.

Yes, that's correct. In this PR you let the entrypoint use http mode for the MCP server - is that needed? Can't we use stdio and somehow forward the stream? It would be easier for the users

@MischaPanch
Member

@tirthpatel90 yes, if you are up to it, the parallelized approach would be far superior. Even though we would use more compute (due to multiple container startups) and the complexity of CI would become much larger, it would have the huge benefit of not having to wait 40 minutes each time.

A necessary requirement is that in the parallelized approach we don't have to change the GH workflows each time language support is added (although we should be able to do that when desired). Thus, it should be something like batch1, batch2, ... batchN, last_batch_executing_everything_else. Do you think that's feasible?

@tirthpatel90

Hi @MischaPanch,

Yes, that is absolutely feasible and a brilliant way to future-proof the CI!

To satisfy the requirement of not modifying the GH workflows for every new language, we can dynamically batch the test discovery. For instance, we can explicitly assign the heaviest toolchains (like C++, Go, Rust, Java) into their own isolated parallel matrix jobs (batch1, batch2, etc.). For the final batch (last_batch_executing_everything_else), we can programmatically run pytest on all remaining tests that were NOT explicitly included in the previous batches.

This ensures that any newly added language tests automatically fall into the "catch-all" bucket without requiring manual YAML updates, keeping the maintenance overhead strictly at zero.
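
A minimal sketch of the catch-all idea, assuming tests are grouped one directory per language: explicitly listed directories go to their named batches, and everything else that test discovery finds falls into the final batch. The directory names and the `EXPLICIT_BATCHES` mapping are hypothetical; only the `last_batch_executing_everything_else` name comes from the discussion above.

```python
CATCH_ALL = "last_batch_executing_everything_else"

# Hypothetical assignment of heavy toolchains to their own batches.
EXPLICIT_BATCHES = {
    "batch1": ["tests/rust"],
    "batch2": ["tests/java", "tests/go"],
}


def plan_batches(discovered, explicit):
    """Map each batch name to its test directories, ending with a catch-all batch."""
    discovered_set = set(discovered)
    plan = {}
    assigned = set()
    for name, dirs in explicit.items():
        overlap = assigned.intersection(dirs)
        if overlap:
            raise ValueError(f"directories assigned to more than one batch: {sorted(overlap)}")
        # Keep only directories that were actually discovered.
        plan[name] = sorted(discovered_set.intersection(dirs))
        assigned.update(dirs)
    # Anything not explicitly assigned lands here, including newly added languages.
    plan[CATCH_ALL] = sorted(discovered_set - assigned)
    return plan
```

With this shape, adding a directory such as a hypothetical `tests/newlang` changes only the contents of the final batch, not the list of batch names the workflow refers to.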

Since we are officially pivoting to this superior architecture, I will close my maximal image PR (#1339) to keep the repo clean. I'll start drafting the matrix workflows to run the segmented batches using rtizzy's leaner image and will open a fresh PR for this soon.

I'll get to work on this!

@rtizzy
Author

rtizzy commented Apr 27, 2026

> Hi @rtizzy . Thanks for the contribution and the discussion!
>
> > I'm pretty sure the architecture is that Serena MCP can only have one project activated at a time, i.e., a single MCP server can only have one project activated and in use.
>
> Yes, that's correct. In this PR you let the entrypoint use http mode for the MCP server - is that needed? Can't we use stdio and somehow forward the stream? It would be easier for the users

@MischaPanch

Thanks for the details. Give me a bit and I'll test things out with that approach.

stdio seems to fit the use case of Serena in VSCode much much better and completely circumvents a problem I was reasoning about.

@rtizzy
Author

rtizzy commented May 2, 2026

@MischaPanch

Alright, I've tested this out a bit.

I have a feeling more doc updates are prudent.

I was testing this against a Continue-based setup in VSCode (with and without dev containers).

The most portable thing I could come up with that worked there was something like this:

name: Serena
version: 0.0.1
schema: v1
mcpServers:
  - name: serena
    command: docker
    cwd: ${{ secrets.HOST_REPO_PATH }}
    type: stdio
    args:
    - run
    - --rm
    - -i
    - -v
    - ./:/workspace
    - serena-infra # This is a container built with the necessary deps for my own repo.

# .continue/.env in project
HOST_REPO_PATH="ABSOLUTE_PATH_TO_REPO_ON_HOST"

Whether that should be documented in the Serena docs themselves is another question, as it's pretty tooling-specific.
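
For clients other than Continue, the same invocation can be expressed as a plain argument list. The sketch below is only an illustration: the helper name `docker_stdio_command` is hypothetical, and it mounts an absolute host path instead of relying on `cwd` plus `./` as the config above does. The `-i` flag keeps stdin open, which the stdio transport needs, and `--rm` removes the container once the client closes the stream.

```python
import os


def docker_stdio_command(host_repo_path, image, mount_point="/workspace"):
    """Build the `docker run` argument list for a stdio MCP server in a container."""
    repo = os.path.abspath(host_repo_path)
    return [
        "docker", "run",
        "--rm",                          # remove the container when the session ends
        "-i",                            # keep stdin open for the stdio transport
        "-v", f"{repo}:{mount_point}",   # mount the host repo into the container
        image,
    ]


if __name__ == "__main__":
    # "serena-infra" is the locally built image name used in the config above.
    print(" ".join(docker_stdio_command(".", "serena-infra")))
```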

rtizzy force-pushed the dockerfilesanity branch from 39c788a to 1c692c0 on May 2, 2026 13:28

Member

@MischaPanch MischaPanch left a comment

Thanks @rtizzy. I tried to leave some comments, but it seems like GitHub is having problems.

I didn't have time to do a deep dive yet, but some things seem strange to me, maybe you could briefly explain.

  1. what's the role of target prod vs dev here?
  2. why are there multiple entrypoints? And why are they stdio, shouldn't it be http in the end?
  3. Why the override in compose-dev.yaml? It only differs in the target
  4. Shouldn't the default policy be pull? Do you want the user to always build?
  5. The ansible and opentofu examples are rather niche, how about using rust or java as example?

@rtizzy
Author

rtizzy commented May 15, 2026

@MischaPanch

No problem.

> what's the role of target prod vs dev here?

The prod target is what is eventually pushed into Github. It's what end users would use.

dev is intended for development of Serena itself.

> why are there multiple entrypoints?

Not strictly required and can be consolidated into the base if desired.

> And why are they stdio, shouldn't it be http in the end?

It was originally HTTP but you requested it to be changed to stdio in an earlier comment, which likely makes more sense considering the architecture of Serena.

> Why the override in compose-dev.yaml? It only differs in the target

This likely makes more sense once you consider the two workflows:

  1. Build and push for other users
  2. Develop serena itself.

The compose-dev example inside the repo is for people working on Serena itself.

> Shouldn't the default policy be pull?

No.

> Do you want the user to always build?

Yes.

Here is the way this works:

  1. The serena project pushes a base image to Dockerhub
  2. An end user creates a Dockerfile and references the Dockerhub image with FROM
  3. They then layer on any necessary dependencies and build this image
  4. This new image is how they use the Serena MCP

A user can push that into their own registry but that use case is probably unlikely for most people.

> The ansible and opentofu examples are rather niche, how about using rust or java as example?

That'd be out of scope for me.

The point of the example is to show "We are providing a base image, layer on what you need in a Dockerfile. Here is a best practice example."

@rtizzy
Author

rtizzy commented May 15, 2026

As a side note:

It would be useful to know what the "happy path" is intended to be.

I'd imagine the vast majority of users will end up leveraging a stdio-based setup.

The "Single project" approach of Serena isn't particularly well suited to HTTP usage.
