2020-03-02 08:19 pm

Week 13: The end of the internship

This will be my last post about the internship.

I enjoyed everything I learned during this internship, and I believe these skills will help me greatly in my career as a software engineer. I wish the internship had been longer and had covered more topics. Sometimes it felt very challenging, like going from 0 to 100, and it would have been better to have more prerequisites before getting into it, but I managed it the best I could.

I had to read and study the documentation more than once to understand it properly in the context of the project. It wasn't easy; it wasn't simple. I'm now able to create litmus tests and to understand how synchronization primitives are used in the code, and I worked with spinlocks and memory barriers. However, I believe 3 months aren't enough to learn the full complexity of the LKMM, the CAT language, synchronization primitives, and memory barriers.

If you are interested in a project related to litmus tests, I strongly recommend that, before you apply, you get familiar with Kernel development (reading and writing code), with setting up virtual machines and running custom kernels in them (e.g., with QEMU), and, most importantly, with the Linux Kernel Memory Model. I believe this will help you get past the initial obstacles and technical issues of running code that is prone to fail, so you'll have more time to dedicate to the things you are interested in.

Today, I feel I learned A LOT from everything I read and did, and I still need to learn more. There's no way I would've done any of this by myself, so I really appreciate every bit of information I got.

I think an organization like Outreachy is one of the most amazing ideas I've seen. I hope it gets bigger, more accessible, and even more diverse.
2020-03-01 05:04 pm

Week 9: Career goals

This post is a bit late (again). Unfortunately, I've been dealing with some health issues, so I haven't been writing blog posts in order to dedicate all the required time to the internship.

Career goals

Now that I've gone through the process of contributing patches to the Linux Kernel, reading Linux Kernel code, and working on the litmus tests project these past months, I believe my overall knowledge of Linux has improved, and I feel more confident.

I improved my skills in reading Kernel code, Git, the LKMM, the Herd tools, Kernel compilation, synchronization primitives, and memory barriers. I believe I still need to learn more to get a job directly related to the Linux Kernel; however, this experience taught me there isn't a single path to my goal, so I'm also considering options beyond the topics covered by the internship. This is why I'll apply to internships and jobs that match my skills. I would also love to volunteer, in my free time, for open source projects that need some help.

One of the skills I want to learn when this internship ends is Kernel debugging, as I ran into obstacles related to sudden freezes and apparent compatibility issues, which were hard to fix. I'll dedicate as much time as possible to continuing my studies, as these 3 months feel too short!

As for the type of job, I prefer a remote one. I love the freedom that a remote job provides, but I'm also open to relocation, as I think it offers an excellent opportunity to learn about different cultures and would help me grow personally and professionally.

My language skills: I'm a native Spanish speaker, my English level is B2, and French is A2.
2020-01-27 02:11 pm

Update about the project's progress and timeline (Week 7 blog)

Time goes really fast! This is the post for week 7, so I'm sending it a bit late...

 

As I mentioned in my previous posts, I've been spending a lot of time reviewing Linux Kernel code that uses synchronization primitives and memory barriers, because it's necessary to get familiar with their implementation and analyze how they behave on an SMP system. This month, I'm studying spinlocks and analyzing litmus tests from the GitHub repo of Paul E. McKenney, one of the maintainers of the LKMM, which is written in the CAT language.

 

This project isn't as simple as I initially thought, as there are very complex tests that include multiple non-intuitive usages of spinlocks. These tests were created to prove the properties of these synchronization primitives, and there's also Linux Kernel code that was simplified into litmus tests. Experimenting with memory barriers and forcing certain effects let me simulate unusual conditions, which helped me understand them further.

 

In addition to this, I studied the CAT language to better understand the axioms that define the simulations provided by the LKMM. The publication is "Modelling of Architectures: Choose your own adventure in herding cats" by Jade Alglave, and the tutorial is currently available here, as the original PDF link is down. I found it through a not-so-quick Google search, but after an email, Jade also kindly pointed me to the same version I found and to the modern version that is in here :)

 

Now, about the timeline: I'm honestly finding this process a bit more complex than I initially thought, even though I didn't expect to become an expert in the LKMM or in the creation of litmus tests. I talked with my mentor about my performance during the first month, and we both agreed that it wasn't necessary to change the timeline. I'll probably need to revisit this, since whenever I'm working on something, other topics come up that I have to study to understand it better!

 

My plan for the remaining time of the internship is to follow a clearer process for achieving the proposed goals, as this topic is complex and takes considerable time to become familiar with and understand properly.


2020-01-19 11:24 pm

The project I'm working on: Litmus tests

In this post, I'll briefly describe the project, its purpose, and the tasks required.


First, why work on the Linux Kernel?


The Linux Kernel is everywhere: the cloud infrastructure that serves the websites you visit, Android phones, computers, embedded systems, and so on. The modern world depends on it! Linux Kernel development is very active, and many software engineers are part of it. For example, in the recent Kernel 5.4 release, at least 1804 programmers contributed 14000 changesets. These are amazing statistics for an open-source project and might sound intimidating, but the more you get involved, the more opportunities you'll find in its development.

 

Now let's talk litmus tests.

 

Concurrency has always been a complicated topic. CPU architectures sometimes differ greatly in their execution models, and on top of that, the Linux Kernel supports more than 30 CPU architectures. This is why the LKMM (Linux Kernel Memory Model) was proposed as a way to simulate executions in the Linux Kernel through litmus tests.

 

It's a tool that allows you to simulate the execution of concurrent code and determine whether it can lead to undesired results or bugs by checking specific conditions, which helps you understand the outcomes the memory model allows on different architectures.

 

If you can't picture how buggy concurrent code can go wrong, check this video about concurrency bugs on GPUs:

 

 

Now let's briefly check what litmus tests look like; this is the most basic Message Passing (MP) litmus test:

 

C MP

{}

P0(int *x, int *y)
{
	WRITE_ONCE(*x, 1);
	WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r0;
	int r1;

	r0 = READ_ONCE(*y);
	r1 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=0)

In this test, processor 0 (P0) writes to the shared variables x and y, and processor 1 (P1) reads them. Then we check a "buggy" condition: P1 observes the write to y (the last one) but not the write to x. Contrary to common sense, this outcome is possible: P1 may observe both writes, only one of them, or neither, because we are simulating execution on a relaxed memory model, where instructions might execute in a different order than the one the programmer intended.

 

So now, the next step is to understand the tools that the Linux Kernel offers you to deal with this situation (e.g., memory barriers and synchronization primitives), so you can write code and test that it runs as you expect. This is the core of the project "Develop a suite of litmus tests for Linux-kernel spinlocks and read-write locks," and I'm working very hard to make it possible.
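As a hedged sketch of one such tool, here is the MP test from above with the second write upgraded to smp_store_release() and the first read upgraded to smp_load_acquire(). It mirrors the MP+pooncerelease+poacquireonce test in the kernel's tools/memory-model/litmus-tests directory, rewritten with this post's x and y; the test name on the first line is just a label chosen here.

```
C MP+release+acquire

{}

P0(int *x, int *y)
{
	WRITE_ONCE(*x, 1);
	smp_store_release(y, 1);
}

P1(int *x, int *y)
{
	int r0;
	int r1;

	r0 = smp_load_acquire(y);
	r1 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=0)
```

With this release/acquire pairing, the LKMM forbids the "buggy" outcome: if P1 sees y equal to 1, it must also see x equal to 1, so herd7 reports the exists clause as never satisfied.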


2019-12-30 10:23 am

Internship update: Struggling and persevering

It's been 3 weeks since my Linux Kernel internship started. I'm pretty happy to keep working on this project, but not everything goes as smoothly as you'd wish.

 

As I described in my previous post, this project involves a lot of learning about the LKMM and memory barriers, a topic that isn't easy even for regular kernel developers. It can be confusing at first, as a programmer doesn't usually deal with the internals of instruction execution order or memory propagation on SMP systems. However, it's an important lesson to learn.

 

My mentor asked me to find a section of Linux Kernel code that uses synchronization primitives, and I chose a WiFi staging driver because there was a chance I'd find bugs to fix at the same time. I found a critical section protected by a lock (a mutex in this case) that seemed buggy because one path returns without unlocking:

 

int rtllib_wx_set_scan(struct rtllib_device *ieee, struct iw_request_info *a,
		       union iwreq_data *wrqu, char *b)
{
	int ret = 0;

	mutex_lock(&ieee->wx_mutex);

	if (ieee->iw_mode == IW_MODE_MONITOR || !(ieee->proto_started)) {
		ret = -1;
		goto out;
	}

	if (ieee->state == RTLLIB_LINKED) {
		schedule_work(&ieee->wx_sync_scan_wq);
		/* intentionally forget to up sem */
		return 0;
	}

out:
	mutex_unlock(&ieee->wx_mutex);
	return ret;
}

source: drivers/staging/rtl8192e/rtllib_softmac_wx.c 

 

However, this wasn't the case: that path schedules the scanning work, which also contains the unlock. I don't know how performance would be affected by this decision if spinlocks were used, but it wasn't a bug in the code. The real difficulty came later, when I was asked to find all the other parts of the code that access the same structures (accesses that could cause concurrency issues, hence the need for synchronization primitives), because a lot of files and functions do.

 

I spent considerable time on this task, as I like to analyze exactly what the code is doing while writing extensive comments on top of the functions. With other tasks pending, like finding and analyzing usages of memory barriers such as smp_store_release/smp_load_acquire and atomic operations, I felt a bit overwhelmed (and unfortunately also got sick).

 

My solution was to order my priorities better and improve my time management. I couldn't read the entire driver code looking for concurrent accesses, so I limited myself to writing about at least 3 of them. I also switched to the next topic (atomic operations) and reviewed past bug-fix commits that my mentor sent me. The fix addressed a misuse of memory barriers: the developers had incorrectly used the memory barrier smp_mb__before_atomic for a non-RMW operation. I wrote a small litmus test to understand the effects of this misuse, which also demonstrates the need for a full memory barrier (smp_mb).
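To illustrate the kind of misuse described above, here is a hypothetical store-buffering litmus test. It is a sketch written for this post, not the test from the internship and not the code from the commits; the test name and variables are made up. P0 puts smp_mb__before_atomic() in front of atomic_read(), which is not a read-modify-write (RMW) operation, while P1 uses a full smp_mb().

```
C SB+before-atomic-nonrmw+mb

{}

P0(int *x, atomic_t *y)
{
	int r0;

	WRITE_ONCE(*x, 1);
	smp_mb__before_atomic();
	r0 = atomic_read(y);
}

P1(int *x, atomic_t *y)
{
	int r1;

	atomic_set(y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)
```

As far as the LKMM is concerned, smp_mb__before_atomic() only provides ordering when it is followed by an RMW atomic operation such as atomic_inc(), so nothing here orders P0's write before its read, and the model should allow the outcome where both processes read 0. Replacing the barrier in P0 with smp_mb() should make that outcome forbidden.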

 

My main resources to continue with my learning process are:

  • Read Linux Kernel documentation
  • Read code!
  • If something isn't clear, search Google for different sources or explanations.

 

 


2019-11-26 08:57 pm

Getting a Linux Kernel internship with Outreachy

Image: Kernel panic, by William Pina

Today I received the news "Congratulations on being accepted…", and I couldn't be more excited!

 

This journey began a couple of months ago after I read on Twitter about an organization called Outreachy, which promotes diversity in tech and accepts applicants from anywhere in the world for a paid internship to contribute to open source projects. It's also supported by some of the top companies in the tech world. It sounded too good to be true at first, and I had to read the Applicant Guide a couple of times before getting convinced I was eligible; I was! The projects couldn't be more amazing as I always wanted to contribute to Linux open-source projects. I applied and got accepted as an applicant.

 

Contributing to a Linux Kernel project

 

When I read the Linux Kernel projects, my first thought was how "hard" they all looked, but one project especially caught my attention: "Develop a suite of litmus tests for Linux-kernel spinlocks and read-write locks." I've always had a thing for everything low level, and I had already done a small project about 16- and 32-bit CPUs and x86 assembly instructions. A project that would give me a better understanding of the internals of concurrency in the Linux Kernel was something I needed.

 

The first part of applying to the Linux Kernel projects was contributing small patches to the staging drivers maintained by Greg Kroah-Hartman. These drivers contain code that doesn't meet the Kernel standards, so there are plenty of files that need to be reviewed, cleaned, refactored, and overall improved. Starting these contributions was a small project in itself. We had to compile and install the latest release candidate of the (still under development) kernel 5.4, read technical lectures about kernel drivers, get started with the tools we needed for the task, and complete a contribution guide. The most important part was getting our contributions good enough to be accepted by Greg.

 

After days of reading code, fixing mistakes, improving my Git skills, and fixing even more errors, I got enough patches accepted to move on to the next step and start the project-specific tasks by contacting the mentors. These tasks finally seemed to be about the project topic, and more reading and activities ensued: the litmus tests, the LKMM, memory barriers, the ARM and PowerPC relaxed memory models, and the tools to simulate or test the memory model. Even if I felt a little overwhelmed by all this information at first, after some days I started enjoying this part of the application process. I also got the tests working on an unused Raspberry Pi 3, so I could do some tests on an ARMv8 CPU.

 

Final Application

 

The mentor helped me resolve my doubts and issues and asked me questions to test my (very recently acquired) knowledge. Fortunately, I did a good job, so I was invited to submit my final application, which would include information about my previous experience with open source projects and a timeline of my (possible) future activities as an intern. I completed it and waited, keeping in touch with my mentor as I continued testing and reading, and all the invested time finally paid off today!

 

My journey with Outreachy as an intern is just starting (soon, actually), but my suggestions to new applicants would be:

  • Read the Applicant Guide and make sure you are eligible and can commit to the process.
  • Choose a project you are passionate about.
  • Make sure you can dedicate enough time to send an excellent contribution to the project.
  • Study everything you possibly can to understand the topic better, and when in doubt, ask mentors.