-
Notifications
You must be signed in to change notification settings - Fork 204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use flock() instead of UUCP-style locking for serial devices #1770
base: master
Are you sure you want to change the base?
Conversation
This replaces AcquireUUCPLockAndOpen() with AcquireLockAndOpen(). Depending on a compile-time option (--enable-uucp-locking), either flock() or UUCP-style lockfiles will be used as the underlying lock mechanism. The default is flock(). The plugins using these routines have been updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @taw10 ,
Thanks for this!
There are some lint issues if you could fix them please:
https://app.travis-ci.com/github/OpenLightingProject/ola/jobs/558598505#L1483-L1487
If possible a test for this would be awesome! I.e. lock a file and try and open it again and confirm it fails etc. Which will then let people know during make test if their chosen locking hasn't worked.
@@ -27,6 +27,7 @@ | |||
#include <signal.h> | |||
#include <string.h> | |||
#include <sys/stat.h> | |||
#include <sys/file.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has sys/file.h existed forever, so we're not breaking UUCP builds which don't have it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, I'll investigate. However the header is checked by the configure script, and used without any #ifdefs in one other place (KarateLight).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay, obviously that plugin breaking wouldn't have quite so much impact, but the plugin has been around for a while and no-one has complained so we're probably good...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
he flock call itself conforms to 4.4BSD, i.e. it's been around since 1997. So I think we're pretty safe here.
I could add guards around the #include (ifdef HAVE_SYS_FILE_H), but if it's worked so far then I don't see any reason to add the extra complexity. On inspection, I'm not actually sure why KarateLight includes this header at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah very good point about the fact it's worked!
I'm not sure about KarateLight either; @cpresser ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I am not sure what to answer here. My C++ is quite rusty, I have not used it in years and sure have no idea which include contains which functionality.
I recall that I did include file locking of the serial-port-device in the karatelight code. A quick search turned up this piece of code here:
https://github.com/OpenLightingProject/ola/blob/master/plugins/karate/KarateLight.cpp#L121
Does that help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The #include <sys/file.h>
has been there since the first version of KarateDevice.cpp (f69a26b), but the flock
call is in a different file (KarateLight.cpp). I suspect the flock
calls were originally in KarateDevice.cpp and got moved.
It does mean, however, that there's not only the #include but also a real use of flock() in OLA already without any tests! Therefore we're probably pretty safe using it for all serial devices. Nevertheless, we'll keep the tests and the conditionals :)
There's not much to be gained from not compiling it.
RemoveLockFile -> RemoveUUCPLockFile GetLockFile -> GetUUCPLockFile
Two more things still to do (add a test and check the situation with sys/file.h), but this should cover most of the comments. The lint errors should be fixed (I may have introduced new errors, let's see). I've also renamed the top-level lock/unlock routines, because "ReleaseLock" seemed too vague. Now it's ReleaseSerialPortLock. Note that in the flock() case, the unit test will really only test the TIOCEXCL locking, since that will prevent the open() call from succeeding before we try to flock(). Working round that would introduce some nasty complexity. But, as long as one of the locking mechanisms works, it's OK. With UUCP lockfiles, the check is done before we try to open the port itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more tweaks
Great thanks.
Only the one! 😆
As mentioned in the comment, won't this lock work on any file descriptor? Maybe the UUCP stuff is only implemented for devices?
Exactly, I'm more interested in catching an issue during CI/build and around our logic than I am bothered about thoroughly testing this particular system feature. If we prove we can't open a file twice due to the lock then that's a good test as far as I'm concerned.
Yep, understood. |
Unit test is still to come (I have to get myself acquainted with the test system first). One more question: would you prefer to have these commits squashed down to at least three (Codespell fix, unit test, everything else)? The overall diff is very small and understandable. Or do you prefer to have the complete history? |
I've added a test, but it's not very satisfactory for a couple of reasons. Let me know what you think! The problem is that we don't have a real serial port to test with, and the TIOCEXCL lock won't work with a normal file. To avoid creating a new file, I used the ChangeLog file for the test since it's always present at the top level of the project. This feels bad to me, but the alternatives all seem less portable. Maybe you or someone else has a suggestion. To prevent the normal file from failing the test, I changed the two locking routines to not fail if |
Co-authored-by: Peter Newman <[email protected]>
Co-authored-by: Peter Newman <[email protected]>
Co-authored-by: Peter Newman <[email protected]>
The idea of poking real hardware as part of
I'll implement this approach. This way, we have a new locking mechanism (flock) and a test that it works, the UUCP stuff is still there as a fallback, and the TIOCEXCL is on top of either without a test. This is already a big improvement over the old situation (legacy UUCP locking without a test). The KarateLight experience (see above) suggests that the UUCP locking will never be used, so the idea of testing UUCP might be purely theoretical anyway.
Unfortunately yes. I'll add some cleanup so that the file gets removed in any case, and a message to make it clear that the file already exists. However I want to avoid deleting a random user file in the event that my choice of filename wasn't "obscure" enough. Update pending... |
UUCP locking: performed if requested at configure time flock(): performed if the flock call is available (always?) ioctl(fd, TIOCEXCL): performed if sys/ioctl.h is available It's possible that all three will be done, but only one of them really has to work. If any of the selected methods fail, however, the port opening will be aborted.
Ok, AcquireLockAndOpenSerialPort() now does this:
If any stage fails, it will bail out, closing the port and releasing UUCP lock if applicable. So, all of the available locking mechanisms (potentially three of them, see below) have to succeed. The bailout logic is a little more complicated than I'd like, because flock() happens after opening, whereas UUCP happens before. AcquireUUCPLockAndOpen() is still available, although no-one should be using it. It does this:
I've split off Each step prints at least an OLA_INFO on success, so you can see which locks are used. If configured with The test program (SerialLockTester) tests that the flock() call succeeds, but not that it prevents a second lock. That would be a lot of trouble to test, because the second flock would have to be from a different process. I consider it more a simple test that the routine really exists - it doesn't seem unreasonable to rely on it doing what the specification says it should! It now uses a PID-based filename, to prevent one failure from causing subsequent ones. UUCP locking and TIOCEXCL are not tested at all because of the difficulties noted above, but it's still overall an improvement on the previous situation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor tweaking
I think we do already as part of our autostart tests TBH (but I can see your point)...
I think that code should already exist as we already do essentially that in some of the plugins that will use this new locking!
Yeah fair point.
Lovely thanks. That should be good enough, as long as the test fails with a non-cryptic message it shouldn't generate too many issues (judging by the auto-start stuff when it's already running as a service which obviously breaks more often). |
include/ola/io/Serial.h
Outdated
* Depending on the compile-time configuration, this will use either flock() | ||
* (the default) or UUCP locking. See: ./configure --enable-uucp-locking. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this now doing both/all?
include/ola/io/Serial.h
Outdated
* If UUCP locking was used (see AcquireLockAndOpenSerialPort()), the lockfile | ||
* will be removed (but only if the PID matches). | ||
* | ||
* Does nothing if flock() was used (see ./configure --enable-uucp-locking). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again can't this do all/more now?
Amazing!
Cool.
👍
Yeah given some of the random architectures we've compiled on, they might just not have the KarateLight plugin included/compiled...
Ah yeah that would be quite a hassle! Although we could have a shell script wrapper which runs two copies if that's easy enough (like we do for some Python stuff), but as you say may not be worth the hassle.
Yeah sounds great! |
Co-authored-by: Peter Newman <[email protected]>
Co-authored-by: Peter Newman <[email protected]>
Co-authored-by: Peter Newman <[email protected]>
Co-authored-by: Peter Newman <[email protected]>
Ok, some updates in addition to your suggested changes (which were all applied):
|
Co-authored-by: Peter Newman <[email protected]>
The CI is failing, but the error doesn't make any sense to me ( |
@peternewman should this go into |
This replaces AcquireUUCPLockAndOpen() with AcquireLockAndOpen(). Depending on a compile-time option (--enable-uucp-locking), either flock() or UUCP-style lockfiles will be used as the underlying lock mechanism. The default is to use flock(). The configure script will fall back on UUCP locking if flock() is not found.
It would also be possible to preserve the API by naming the top-level open/lock routine AcquireUUCPLockAndOpen, but then the routine name would not be an accurate description. I've updated the plugins using these routines (only two of them, StageProfi and UsbPro).
ReleaseUUCPLock() is still present, becoming a no-op if flock() is being used. With flock(), the lock is automatically released when the file descriptor is closed.
In practice, the TIOCEXCL lock (when available) is much more helpful, because it prevents any other process from opening the device and doesn't rely on all processes using the same locking scheme. This lock is still used when flock() is available.
Fixes #1189, #1674 and many related problems on various platforms (e.g. running in a Toolbox container on Fedora Silverblue).