Difference between revisions of "Design Sessions"

== 2018: Developer and Design Summit ==
{{TODOLeft|Session hosts, please send notes to xen-devel@ and update the wiki or send the notes to community dot manager at xenproject dot org.}}
==== Architecture ====
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#00133 Reworking x86 Xen, current status and future plan] (Wei Liu, Citrix)'''
: A year ago Wei presented two projects about reworking x86 Xen. A lot has happened since then. This session aims to give a quick update on the progress and to ask stakeholders for suggestions and opinions on future development.
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#00129 PCI pass-through with de-privileged QEMU] (Xin Li, Citrix)'''
: Previously, when a PCI device was passed through, QEMU could only run in privileged mode. This design lets QEMU always run in de-privileged mode.
:* Change to Xen, mainly in libdevicemodel, to add the DM-ops for passing through a PCI device in xen-domid-restrict mode.
:* Change to libxl, to pass the PCI config fd to QEMU.
:* Change to QEMU, to read the configuration and avoid reading from /dev/mem directly.
:* Change to the toolstack, to allow QEMU to read PCI info from sysfs.
: The part about reading from /dev/mem needs further discussion:
:* Which device / OS performs this operation (reading from /dev/mem)?
:* Can mmapping /sys/bus/pci/devices/0000:XX:XX.X/resourceX replace reading from /dev/mem?
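The second open question above can be illustrated with a minimal sketch of mapping a sysfs resource file instead of opening /dev/mem. This is not QEMU's or libxl's code: the helper names `sysfs_resource_path` and `read_bar` are hypothetical, the sketch assumes the resource file reports a non-zero size and can be mapped read-only, and whether this approach covers every case that currently needs /dev/mem is exactly what the session left open.

```python
import mmap
import os


def sysfs_resource_path(bdf, bar, sysfs_root="/sys/bus/pci/devices"):
    """Build the path of a PCI BAR resource file, e.g. .../0000:03:00.0/resource0.

    Hypothetical helper: `bdf` is the domain:bus:device.function string.
    """
    return os.path.join(sysfs_root, bdf, "resource%d" % bar)


def read_bar(path, offset, length):
    """Map the file at `path` read-only and return `length` bytes from `offset`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if offset < 0 or length < 0 or offset + length > size:
            raise ValueError("requested range lies outside the resource")
        # Map the whole file; a de-privileged process only needs this one
        # file descriptor, not access to all of physical memory.
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as region:
            return region[offset:offset + length]
    finally:
        os.close(fd)
```

The point of the sketch is that access is scoped to one file per BAR, so the descriptor could in principle be opened by a privileged component and handed to a de-privileged QEMU, in the same spirit as the PCI config fd mentioned above.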
* '''(NO NOTES) Resource mapping, PV-IOMMU and page ownership in Xen (Paul Durrant, Citrix)'''
{{InfoLeft|Comment by host: I don’t think anyone took notes at my session. I got what I needed out of it though… and it’s probably not of particular interest to anyone who was not there.}}
: The recent series to add direct resource mapping into Xen highlighted areas where the current status quo of PV domains being able to map any page assigned to them is problematic from a security PoV. There are pages that constitute a resource, which should probably be accounted to a domain without that domain having the privilege to map the resource. The current scheme does not allow for this. Thus it would be useful to discuss ideas on how we might improve the situation.
: Page ownership also creates problems with PV-IOMMU when dealing with grant mapped foreign pages.
==== Intel Specific ====
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#01592 Intel Processor Trace for Xen hypervisor design discussion] (Luwei Kang, Intel)'''
: <em style="color: red">Slides are available [https://www.slideshare.net/xen_com_mgr/xpdds18-intel-processor-trace-for-xen-hypervisor-luwei-kang-intel here]</em>
: Intel Processor Trace (PT) is a hardware feature that records information about software execution with minimal impact on system execution. Existing hardware makes it hard to enable Intel PT in a guest, because implementing a shadow ToPA is very complex. The Intel PT VMX improvements treat PT output addresses as Guest Physical Addresses (GPAs) and translate them using EPT, which simplifies virtualizing Intel PT for use by guest software. This session is a deep-dive introduction to Intel Processor Trace and a design discussion of the SYSTEM mode implementation, Intel PT introspection, new qualification of Intel PT output, nested virtualization, live migration and so on.
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#01592 How to support more vCPUs in a HVM guest] (Chao Gao, Intel)'''
: To better support HPC, Intel has launched a product, code-named Knights Mill, which supports up to 288 logical CPUs and a high-bandwidth on-die memory called MCDRAM. We have been working on enabling Xen for building HPC clouds. One main task is to raise the maximum number of vCPUs in an HVM guest to 288. Although we have sent out several versions of patches for this purpose, not all problems have been identified and discussed. In this design session, we want to discuss these problems and reach an agreement on how to deal with them.
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#01592 NVDIMM Discussion] (George Dunlap, Citrix)'''
{{TODOLeft|Update this once the new version of the NVDIMM DOC is available.}}
: Non-Volatile Dual In-line Memory Module or NVDIMM is a type of memory device that can provide persistent storage and retain data across power cycles/failures. This discussion is about the design to support NVDIMM in Xen.
: <em style="color: red">The Notes will be included in an updated version of the [https://xen.markmail.org/thread/ef6vfxvahydeq2rg NVDIMM DOC]. Slides are available [https://www.slideshare.net/xen_com_mgr/xpdds18-speculation-and-response-spectre-meltdown-xpti-and-panopticon-george-dunlap-citrix here]</em>
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#01086 SGX deep dive and SGX virtualization design discussion] (Kai Huang, Intel)'''
: Software Guard Extensions (SGX) is Intel's unique security feature, which has been present in Intel's processors since the Skylake generation. With existing HW/SW solutions, the hypervisor does not protect tenants against the cloud provider, and thus against the supplied operating system and hardware. Intel SGX solves this by using an enclave, which is a protected portion of a userspace application where the code/data cannot be accessed directly from outside by any software, including privileged software such as the BIOS and VMM. This session is a deep-dive introduction to SGX, followed by a design discussion of adding SGX virtualization to Xen. We will start with the SGX deep dive, and then go into the SGX virtualization design, from high-level design to details such as EPC management/virtualization, CPUID handling, interaction with VMX, live migration support, etc.
==== Embedded and Safety ====
* '''[TODO Dom0less and static partitioning] (Stefano Stabellini, XILINX)'''
: Running Xen without Dom0
* '''[TODO A Strawman Plan to Make Safety Certification for Xen Easier] (Lars Kurth, Xen Project)'''
: ''The Plan''
: Hypervisors were once seen as purely cloud and server technologies, but have slowly seeped into the embedded space. This is in particular true for the Xen Project, which is being used by a number of vendors to build automotive stacks.
: However, to be successful in automotive (as well as other future market segments where Xen could be useful), the project needs to be easily certifiable. To facilitate this, we have developed a straw-man plan, which focusses on the following topics:
:* Reducing code size significantly using Kconfig
:* Coding standards
:* An RTOS based Dom0, or dom0-less Hypervisor
:* Etc.
: In this session, we will share the high-level plan, with the goal to identify any collaborators and get community feedback. The session will also touch briefly on longer term challenges.
: ''Feedback received''
: We will also share feedback from others so far, such as feedback from Genivi AMM, Platform Security Summit, Linaro and others.
: ''Status Update''
: How much progress have we made?
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-08/threads.html#00167 Graphic virtualization on Xen]''' (Julien Grall, Arm)
: There is increasing interest in sharing the GPU between multiple domains. This is an open session to discuss the possibility of supporting different GPUs (Mali, PowerVR, ...) with Xen.
==== Working Practice, Process, ... ====
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#00126 Testing/Building with Docker/GitLab] (Doug Goldstein, Rackspace)'''
: Using Docker containers to provide "official" build environments with known dependencies that can be used to build Xen and build all of its components. Using GitLab to build every commit to help catch regressions early. Looking to discuss how to best do this and the end goal with some time frames to make this happen.
: Could we automate some tests for series submitted to the ML?
* '''[TODO (Automated?) Performance Testing in Virtualization] (Dario Faggioli, Suse)'''
: Detecting performance regressions, and identifying what causes them, is particularly hard in virtualization. In fact, what benchmarks shall be used? In what kind of VMs do we run them? How many VMs, and how large? All equally large? Same benchmarks in all VMs? Also, what do we want to measure: virtualization overhead? The impact of a change/feature, or of a particular configuration of the hypervisor, the host OS or the guest OS? Or maybe we want to compare different virtualization solutions?
: Also, with so many moving parts, automation is a must, but may also be problematic. E.g., hosts and VMs need to be provisioned and benchmarks run concurrently in VMs.
: And what about comparing different runs, reaching statistical significance...
: This session goes over these challenges, explains what is being done, both within SUSE and in the community, and tries to envision how to improve things.
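As an illustration of the "comparing different runs, reaching statistical significance" point, here is a minimal sketch that compares two sets of benchmark results with Welch's t statistic. It is not the tooling used at SUSE or in the community; the function name `welch_t` and the sample numbers are assumptions made up for the example, and each sample needs at least two runs with some variance.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom.

    `a` and `b` are lists of results (e.g. run times) from repeated
    benchmark runs on two builds or configurations.
    """
    # Per-sample variance of the mean.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up numbers: five runs before and five runs after a change.
baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
patched = [105.0, 107.0, 103.0, 106.0, 104.0]
t, df = welch_t(baseline, patched)
print("t = %.2f, df = %.1f" % (t, df))
```

A large absolute t for the given degrees of freedom suggests the difference between runs is unlikely to be noise; the threshold, the number of runs, and what to measure remain the open questions the session raises.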
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#00166 Process changes: is the 6 monthly release Cadence too short, Security Process, ...] (Lars Kurth, Xen Project)'''
: Release Cadence: 2 years ago, we moved to a 6 monthly release cadence. The idea was to help companies get features into Xen in a more predictable way. This appears not to have worked. At the same time, the number of releases is creating problems for the security team and some downstreams. I wanted to collect views to kick-start an e-mail discussion.
: Security Process: See https://lists.xenproject.org/archives/html/xen-devel/2018-05/msg01127.html
: Other changes that may be worth highlighting ...
==== Performance ====
* '''[https://lists.xenproject.org/archives/html/xen-devel/2018-07/threads.html#00017 XenWatch Multithreading Design Session] (Dongli Zhang, Oracle)'''
: Xen domU create/destroy and device hotplug rely on the xenwatch kernel thread to run the xenwatch event callback function for each update to a subscribed xenstore node. A hang in any event callback function stalls the single xenwatch thread and prevents further domU create/destroy or device hotplug. This talk presents how Xenwatch Multithreading can address the xenwatch stall issue. In addition to the default xenwatch thread, dom0 creates a kernel thread for each domU to handle that domU's xenwatch events. Therefore, domU create/destroy and device hotplug remain possible even when a specific per-domU xenwatch thread is stalled. The talk first discusses the limitations of the single-threaded xenwatch design with some case studies, then explains the basics of paravirtual drivers, and finally presents the challenges, design and implementation of xenwatch multithreading.
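The per-domU threading idea can be modelled in user space with a minimal sketch. This is not the Linux kernel implementation discussed in the session: the class `WatchDispatcher` and its methods are hypothetical, and Python threads and queues stand in for kernel threads and the xenwatch event list.

```python
import queue
import threading


class WatchDispatcher:
    """Route watch callbacks to one worker thread per domain id (toy model)."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def _worker(self, q):
        # Each worker drains only its own domain's queue, so a callback
        # that hangs here blocks this domain and no other.
        while True:
            callback = q.get()
            if callback is None:  # sentinel: stop this worker
                return
            callback()

    def dispatch(self, domid, callback):
        """Queue `callback` for `domid`, creating that domain's worker on first use."""
        with self._lock:
            q = self._queues.get(domid)
            if q is None:
                q = queue.Queue()
                self._queues[domid] = q
                threading.Thread(target=self._worker, args=(q,), daemon=True).start()
        q.put(callback)


# Usage: the handler for domain 1 hangs, yet domain 2's event is still handled.
stalled = threading.Event()
handled = threading.Event()
dispatcher = WatchDispatcher()
dispatcher.dispatch(1, stalled.wait)
dispatcher.dispatch(2, handled.set)
print("domain 2 handled:", handled.wait(timeout=5))
stalled.set()  # release the stalled handler
```

With a single shared queue and one worker, the second callback would wait behind the stalled one, which is the failure mode the session describes.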
==== Security ====
* '''[TODO Silo mode for extra defence in depth] (Xin Li, Citrix)'''
: workloads, with an expectation of no cross communication. Therefore, the default in Xen of allowing arbitrary communication is an unnecessary set of attack surfaces. We'd like to support, by default, rather more restrictions in use cases like this.
* '''[TODO Panopticon: See no secrets, leak no secrets] (George Dunlap, Citrix)'''
: This is a follow-on from the Spectre/Meltdown issues, where it would be a very good idea to get rid of the Directmap/etc, and we should think about doing per-domain heaps/etc. to reduce the quantity of "non-relevant" data mapped in context, to reduce the risk of data leakage.
* '''[TODO What is OpenXT and the Xen Security Community Doing - this was primarily about measured boot and Win10 support] (Lars Kurth, Xen Project & Rich Persaud, OpenXT)'''
: Paul Durrant and Lars Kurth were at https://www.platformsecuritysummit.com/ this year. Lars is happy to walk those who are interested through the highlights, expected contributions, etc. from the event and answer any questions you may have.
==== Other ====
* '''[TODO USB pass-through on Xenserver] (Xin Li, Citrix)'''
: Previously, a user could only pass through a whole USB controller (as a PCI device) via the command line. This feature allows the user to pass through different physical USB devices to different VMs. The current solution is based on QEMU. To support all guest OSes (both HVM and PV), an alternative solution is to implement PVUSB. To use PVUSB, we need usbfront in the guest OS (Windows and Linux) and usbback in dom0. PVUSB frontend/backend drivers once existed in SLES11, but they were later removed. So there is now no Linux kernel support for PVUSB (neither usbfront nor usbback), and there is no Windows usbfront for PVUSB either. We'd like to raise this topic and discuss:
:* a comparison of our phase 1 solution (QEMU-based USB passthrough) with PVUSB;
:* the issues of the previously existing PVUSB solution (SLES11);
:* the plan to implement PVUSB and address the issues above.
* '''[TODO From Hobbyist to Maintainer, Why and How] (Wei Liu, Citrix)'''
: Open source projects like Xen and the Linux kernel have become the cornerstones of our modern infrastructure. In this session Wei is going to explain why one, as a software engineer, would want to invest in building up technical competence and soft skills to ultimately become a maintainer in those established projects, how this can help personal career goals and business development, and finally what is involved in getting maintainership.
* '''[TODO Unikraft: Design and Use Cases] (Florian Schmidt, NEC)'''
: We can discuss the architecture of Unikraft, and collect suggestions from the community. Let's also collect the use cases that people use Mini-OS for, to see what functionality is still needed to eventually replace Mini-OS with Unikraft.
== 2017: Developer and Design Summit ==
=== Notes ===
'''Intel sessions'''
* [http://markmail.org/message/ibzznuc6c6suow3o Notes from the 5-level-paging session ]
'''ARM Sessions'''
* [http://markmail.org/message/ydwaxnbn6f76inhg Notes from PCI Passthrough design discussion at Xen Summit]
'''Coprocessor Sharing and Stubdoms'''
* [http://markmail.org/message/qz3mtbzypohc7vh2 Notes on stubdoms and latency on ARM]
* [http://markmail.org/message/t7ipyk44gbjvxrqv Shared coprocessor framework followup]
* [http://markmail.org/message/mxqriltr7dt5yr3p SCF configuration followup]
'''PVH (v2)'''
* [http://markmail.org/message/ryvksozgrfefiadh Notes from the PVH toolstack interface session]
* [http://markmail.org/message/v3ks2hkfodip777c Notes from the PVH performance session]
'''Release and Build tools'''
* [http://markmail.org/message/7e2mdpimvrmsppq5 Notes Design Session: Making Releases Lessons Learned: Improving Our Release Process and Tooling]
* [http://markmail.org/message/kyibfdwfv7plmzuz Notes Design Session: Testing & CI Process and Workflow Improvements, x86/ARM/Embedded Testing, etc. - Does what we do today work?]
* [http://markmail.org/message/y6pzru76ufizz2rv Build tools follow up]
* [http://markmail.org/message/4as7zfl2tfqhpdyo A document for Xen release management ]
* [http://markmail.org/message/lvig3u2gcqt3nwgs A document for Xen release management, v2]
'''Security, Safety'''
* [http://markmail.org/message/h2bjcu47mi5ozxgv Notes of Design Session: Xen Certification in Automotive Industrial]
* [http://markmail.org/message/zxtisdcbh6k7mdr5 Notes for Design Session: Loose ends for becoming a CNA (CVE Numbering Authorities) and other Security Team Operational Questions]
* [http://markmail.org/message/37annnvm7wwygr4j Notes from Design Summit Hypervisor Fuzzing Session ]
* [http://markmail.org/message/lvoapfwg66m7agp7 Scripts to check XSA patch-level on xen trees (xen.git, qemu-xen.git & qemu-xen-traditional.git)]
* [http://markmail.org/message/ufgy7p3zk7umnuik Notes from Design Session: Solving Community Problems: Patch Volume vs Review Bandwidth, Community Meetings ... and other problems]
=== Sessions Highlighted on the Wiki ===
|Project=Graphics Virtualization
|Date=July 11 2017
|Contact=(OpenXT) Rich Persaud, Christopher Clark
|Desc=GPU virtualization is used in Server VDI, Automotive, Desktops and Laptops. GPU vendors have different approaches to virtualization of 3D graphics (NVIDIA GRID, AMD MxGPU, Intel GVT, Imagination PowerVR OmniShield), while software-based graphics virtualization may not support modern video and user interface animations. Gaming is one of the few growth areas for PCs and CAD can be done via remote desktop. What are current best practices for Xen users and developers to achieve high-performance 3D graphics on Windows, Linux and Android? Is KVM better than Xen for graphics virtualization?
* Intel GVT-d and GVT-g with local multi-monitor display
* Zero-copy display of guest framebuffers: Intel GVT, ARM with and without IOMMU
** Baboval 2013: http://events.linuxfoundation.org/sites/events/files/slides/Zero-Copy%20Display%20of%20Guest%20Framebuffers%20using%20GEM.pdf
* Display management: GVT, PV-displ, Qubes compositor, OpenXT surfman
* HID virtualization (secure input, seamless mouse-display switching, multi-touch)
SCHEDULE: http://sched.co/AjEV
{{comment| Feel free to make suggestions here}}<br>
{{vote|And whether you intend to attend}}
|Project=Xen Toolstacks for Server and Edge Use Cases
|Date=July 11 2017
|Contact=(OpenXT) Rich Persaud, Christopher Clark, Chris Rogers
|Desc=Many Xen toolstacks have come and gone. Libxenlight was created to provide a common base layer upon which higher-level toolstacks could be built. What is the roadmap for libxenlight to meet the needs of servers, local/enterprise managed clients, OTA update for embedded and mobile devices, unikernels, containers and automated testing? Can we reduce duplication among libvirt, xapi (OCaml), xenrt (Python) and OpenXT (Haskell) toolstacks? Can Xen management tools compete with DevOps expectations set by the fast-moving container ecosystem?
* LibXL
** configurable build (equivalent to hypervisor Kconfig)
** error handling: map error messages to numeric codes
** Configuration file for stub domains: Mini-OS, Linux (GPU/NIC PT, CD), rumpkernel
** State management: multiple LibXL clients per host
* CoreOS rkt and Xen
* Toolstack Service VMs
* Xenstore isolation: options between 1/host and 1/VM?
SCHEDULE: http://sched.co/AjHv
{{comment| Feel free to make suggestions here}}<br>
{{vote|And whether you intend to attend}}
|Project=Testing Server and Edge Hypervisors
|Date=July 11 2017
|Contact=(OpenXT) Rich Persaud, Christopher Clark
|Desc=Virtualization increasingly depends on hardware support, while hardware diversity continues to increase. At present, common feature configurations are tested and given first-class support. Other configurations imply expert mode and private testing. Derivative projects also carry patches that may not be acceptable to upstream Xen, but are common to edge (client, embedded) use cases. Can downstream projects contribute test capacity for non-server configurations of Xen?
These test cases are relevant to OpenXT:
* Xen feature subsets (Kconfig)
* GPU passthrough/virtualization with local display: Linux, Windows (USB video capture)
* Measured Launch (Intel TXT, AMD SVM, TPM 1.2, TPM 2.0)
* Inter-VM communication: libvchan, V4V
* Stub domains: Mini-OS, Linux
* Driver domains: network, USB
SCHEDULE: http://sched.co/AjGk
{{comment| Feel free to make suggestions here}}<br>
{{vote|And whether you intend to attend}}
|Project=Design Session: Loose ends for becoming a CNA (CVE Numbering Authorities) and other Security Team Operational Questions
|Date=July 13 2017
|Contact=Lars Kurth, Ian Jackson
|Desc=The Xen Project has in-principle agreement to become a CVE Numbering Authority. However, to do this, we need to define the scope of the CNA. A number of people have worked on this, but we need some community input.
'''Consolidate Security Coverage Documents'''
Consolidate security coverage documents where possible (we have a proposal). Specifically
* Review the proposal (currently in a [https://docs.google.com/document/d/17LiK-C3oBFZNpxeihXxkM2Bagn7A2L3Dv1u1ZGZe_TQ/edit google doc])
* Review the scope (currently in a [https://docs.google.com/spreadsheets/d/1wLb37mbxN715rlYD8eKV8htgatDI41noOmQ1H428QX4/edit#gid=0 google doc]) - this may involve clarifying the supported status of some components
* Once we have agreement, we basically just need to document the outcome, publish it and get the process started.
'''Other Operational Issues'''
* Automated checking of XSAs against releases (see https://xenbits.xenproject.org/gitweb/?p=people/larsk/xen-release-scripts.git;a=summary)
* Usage of RT and the issues it has caused => does it work, what to change?
'''Possible/Proposed Process Changes?'''
* Bundling of issues / once every other week or monthly XSA publication?
* Include maintainers on pre-disclosure when affected and not on security team
* xen-devel thread at http://markmail.org/message/zxtisdcbh6k7mdr5
* SCHEDULE at http://sched.co/AjHl
{{comment| Feel free to make suggestions here}}<br>
{{vote|And whether you intend to attend}}
== Archive ==
See [[Hackathon_and_Dev_Meetings]]

Latest revision as of 11:02, 9 August 2019