Due to the nature of my work, I often end up working with various HTTP APIs. While many are incredibly well documented, clear, and intuitive, sometimes that just isn’t the case. Often at this point I need to inspect what an SDK in a different language is doing. Other times I’m sure I’ve got it right, but perhaps I’ve missed an encoding step somewhere and my request is slightly incorrect. Either way, the best way to debug is often to look at the raw HTTP request I’m sending (or even the response I receive).
One of the simplest ways to do this is when you are using the requests library. I use this snippet for debugging quite often:
```python
import requests

def raw_request(request: requests.Request) -> str:
    request = request.prepare()
    output = f"{request.method} {request.path_url} HTTP/1.1\r\n"
    output += '\r\n'.join(f'{k}: {v}' for k, v in request.headers.items())
    output += "\r\n\r\n"
    if request.body is not None:
        output += request.body.decode() if isinstance(request.body, bytes) else request.body
    return output
```
To use it, do the following:
```python
request = requests.Request("POST", "https://example.com", json={"Hello": "World"})
print(raw_request(request))
```
You’ll see a raw HTTP request that looks something like:
```
POST / HTTP/1.1
Content-Length: 18
Content-Type: application/json

{"Hello": "World"}
```
Now, of course, this depends on the internals of requests never changing. It also means that you need to be in control of creating the request. What about the times when you aren’t?
A “debugging proxy” is a web proxy which logs the HTTP(S) traffic between your computer and the rest of the world. With these logs, you can then inspect the requests made and figure out where/if things are going wrong.
Some options for debugging proxies are:
| Name | Windows | Mac | Linux |
|---|---|---|---|
| Wireshark | ✅ | ✅ | ✅ |
| Fiddler | ✅ | ❌ | ❌ |
| Charles | ❌ | ✅ | ❌ |
| Burp Suite | ✅ | ✅ | ✅ |
| HTTP Toolkit | ✅ | ✅ | ✅ |
Personally, I tend to use Fiddler on Windows most of the time, but on a Mac I’ll use Charles. Wireshark can do everything, but with that comes a huge amount of complexity that I find is overkill for HTTP debugging, particularly when it comes to debugging TLS-encrypted sessions.
Choose whichever one you want and fire it up. Now, all you need to do is set a couple of environment variables and you’ll be debugging your requests in no time.
To tell the libraries to use the proxy rather than send the requests directly to the desired server, you need to set some environment variables. These are `http_proxy`, `HTTP_PROXY`, `https_proxy`, and `HTTPS_PROXY`.
Every tool will use its own port for the proxy, but it’s usually 8888, 8080, or 8000 by default.
If we use Fiddler, which is on port 8888, as an example you’d set these environment variables as follows:
```
http_proxy=http://127.0.0.1:8888
HTTP_PROXY=http://127.0.0.1:8888
https_proxy=http://127.0.0.1:8888
HTTPS_PROXY=http://127.0.0.1:8888
```
Note, you can also do this in your Python code:
```python
import os

proxy = 'http://127.0.0.1:8888'
os.environ['http_proxy'] = proxy
os.environ['HTTP_PROXY'] = proxy
os.environ['https_proxy'] = proxy
os.environ['HTTPS_PROXY'] = proxy
```
Now, open up Python and make a request:
```python
import requests

response = requests.get("http://example.com")
print(response.text)
```
You’ll get a lovely stream of HTML coming back. If you look in your tool (again using Fiddler as an example), you can see the following:
And if we look at the inspectors, we can see the request sent at the top, and the response received at the bottom:
There’s a problem though. Try again with this code:
```python
import requests

response = requests.get("https://example.com")  # Note the 'https'
print(response.text)
```
This time you’ll get an SSL error. Since the proxy is in the middle, it’s trying to decrypt the request you send (you may have to turn this feature on in your proxy) so it can display it, before then forwarding it on to example.com. However, the library has correctly realised that your connection is no longer secure and throws an exception.
These tools generate their own root certificates and use those when forwarding the requests. If our tools are told about these certificates, and that they can be trusted, then we can make TLS encrypted requests and the proxy will be able to debug and display them.
To do this, we first need to export the certificate from our tool. With Fiddler, you can get it from a link by visiting http://127.0.0.1:8888. For Burp Suite, you open the tool, visit the “Proxy” tab, and select “Import / export CA certificate”. Other tools have similar mechanisms.
Now, we need to convert this certificate into the PEM format. Depending on the tool, how it exported it, and the OS you are on, the instructions here will vary. Generally searching for “Convert [ext] into PEM on [OS]” will get you what you need.
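If the tool hands you a DER-encoded certificate (a common case, though exactly which format you get is an assumption here), Python’s standard library can do the conversion for you. A minimal sketch:

```python
import ssl

# Assumes the exported certificate is DER-encoded; adjust the paths to suit.
with open("FiddlerRoot.cer", "rb") as der_file:
    der_bytes = der_file.read()

pem_text = ssl.DER_cert_to_PEM_cert(der_bytes)

with open("cert.pem", "w") as pem_file:
    pem_file.write(pem_text)
```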
Once you have your root certificate as a PEM file, you just need to tell the various tools about it with yet another environment variable. requests uses one called `REQUESTS_CA_BUNDLE`. httplib2 uses `HTTPLIB2_CA_CERTS`. Both just expect a path to the file, e.g.

```
HTTPLIB2_CA_CERTS="/path/to/cert.pem"
REQUESTS_CA_BUNDLE="/path/to/cert.pem"
```
Again, this can be done in Python code.
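For example, a sketch mirroring the proxy variables we set earlier (the path is whatever you saved your PEM file as):

```python
import os

# Point requests and httplib2 at the proxy's root certificate
cert_path = "/path/to/cert.pem"
os.environ['REQUESTS_CA_BUNDLE'] = cert_path
os.environ['HTTPLIB2_CA_CERTS'] = cert_path
```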
Now if we try the code from before:
```python
import requests

response = requests.get("https://example.com")
print(response.text)
```
You will see the HTML in your console, and the request will appear in your debugger.
Viewing HTTP(S) requests and responses is just one small part of what these tools can do. They can be configured to automatically drop certain requests, respond with pre-canned information to others, and even run scripts to process and respond. I’ve even written about Fiddler response rules before. Each tool has different features and capabilities, but once you’ve picked one, it’s well worth knowing what it can do. And obviously, this isn’t limited to Python. Other tools, languages, and libraries can be debugged this way. Sometimes you’ll get lucky and they’ll respect the system settings. Other times you’ll need to do what you did here with the environment variables.
requests, pylint, or black. However, 13 are our own and shared with others. Of those 13, 9 are open source and available for the general community. A further 3 out of the 50 packages we use are my own creation from outside work. While some are personal, and some were created at work, all were created by me.
Without these packages, Outlook iOS wouldn’t be where it is today. Personally, I think these various tools were paramount in allowing developers to focus on what really matters: developing the app. Knowing that these tools were available and could handle the various day-to-day issues removes a massive burden and improves the results we see.
These various tools have been so successful that many other teams at Microsoft contact me asking to use them. I have to admit that it gives me the greatest pleasure when I can point out that not only can they use them, but they are open-source so anyone can use them.
Here is a brief description of each of these tools that I created and how you can use them for your app/library/etc. (Note: order does not imply importance)
First up is a personal creation, deserialize. This library takes a dictionary or list and a type and creates an instance of that type using the data supplied.
For example, if you want to convert this data:
{"a": 1, "b": 2}
Into an object with `a` and `b` as properties, you’d have to do something like this:
```python
class MyThing:

    def __init__(self, a, b):
        self.a = a
        self.b = b

    @staticmethod
    def from_json(json_data):
        a_value = json_data.get("a")
        b_value = json_data.get("b")

        if a_value is None:
            raise Exception("'a' was None")
        elif b_value is None:
            raise Exception("'b' was None")
        elif type(a_value) != int:
            raise Exception("'a' was not an int")
        elif type(b_value) != int:
            raise Exception("'b' was not an int")

        return MyThing(a_value, b_value)

my_instance = MyThing.from_json(json_data)
```
With deserialize, all you need to do is:
```python
import deserialize

class MyThing:
    a: int
    b: int

my_instance = deserialize.deserialize(MyThing, json_data)
```
deserialize will run all the checks for you and give you a nice new shiny object from it. It of course works to any depth of types, and not just primitives.
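As a hedged illustration of the nested case (the field names here are made up, but they follow the same annotation-driven pattern shown above):

```python
import deserialize

class Address:
    street: str
    city: str

class Person:
    name: str
    age: int
    address: Address

# The nested "address" dictionary is deserialized into an Address instance
person = deserialize.deserialize(Person, {
    "name": "Hodor",
    "age": 42,
    "address": {"street": "1 Main Street", "city": "Winterfell"},
})
print(person.address.city)  # Winterfell
```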
The reason this comes up as #1 is because it is a foundational building block of so many other packages in here. If I ever consume from a REST API, load data from disk, or even query a database, you can be sure I’ll have at the very least considered using this package to make it easy and error-free.
protool removes all the pain from dealing with provisioning profiles. Instead of being mysterious binary files, protool makes them easy to use, understand, and work with. Some examples of what it can do:
- `protool diff --profiles /path/to/profile1 /path/to/profile2`
- `protool read --profile /path/to/profile --key UUID`
- Decode a profile without remembering the arguments to the `security` command: `protool decode --profile /path/to/profile`
These commands are actually based around the full Python API it provides. Some examples:

```python
import protool

profile = protool.ProvisioningProfile("/path/to/profile")

# Get the diff of two profiles
diff = protool.diff("/path/to/first", "/path/to/second", tool_override="diff")

# Get the UUID of a profile
print(profile.uuid)

# Get the full XML of the profile
print(profile.xml)

# Get the parsed contents of the profile as a dictionary
print(profile.contents())
```
Personally, the star feature of protool is as a diff driver for git. Normally if you change profiles you see “Binary files differ” from git. With protool you can edit your git config (at any level) and add:
[diff "mobileprovision"]
external = protool gitdiff -g
This will let you see the differences in XML format. However, that on its own isn’t particularly helpful. You could just as easily have used `security cms -D -i` in the config and it would do the same thing. The real power is in being able to ignore keys. For example:
[diff "mobileprovision"]
external = protool gitdiff -i TimeToLive UUID -g
This will ignore the time to live value, as well as the UUID in the diff. You know those will be different between any two profiles, so why bother cluttering your diff with them?
Dealing with localization can be tough, but dotstrings makes it just that little bit easier.
This tiny tool does one thing and one thing only: it reads your .strings files. Here’s the entirety of what it does:
```python
import dotstrings

entries = dotstrings.load("/path/to/file.strings")

for entry in entries:
    print("Key: " + entry.key)
    print("Value: " + entry.value)
    print("Comments: " + "\n".join(entry.comments))
```
Why is that useful, you ask? Well, it allows you to test your strings easily! We use it directly for a bunch of checks, but you’ll see later how we integrate it with another tool for even better testing.
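As a sketch of the kind of check I mean (this particular rule is a made-up example rather than one of our actual checks): make sure every string has a comment so translators get the context they need.

```python
import dotstrings

entries = dotstrings.load("/path/to/file.strings")

for entry in entries:
    # Fail the test run if a string is missing its translator comment
    assert entry.comments, f"String '{entry.key}' has no comment for translators"
```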
One of the most annoying and difficult things to comprehend as an Apple developer is the Xcode project format. Testing it to ensure that developers haven’t accidentally broken anything, or moved files where they shouldn’t be, etc. can be a real nightmare, especially when coupled with the fact that the pbxproj format is inscrutable to most. This is where xcodeproj comes in. It aims to solve all of those woes. By simply running:
```python
import xcodeproj

project = xcodeproj.XcodeProject("/path/to/project.xcodeproj")
```
you now have a nice, easy to understand, simple to use, project object which you can test directly.
Let’s look at a trivial example where you are sick of seeing Xcode have those files highlighted in red because they exist in the project but no longer exist on disk. How would you make sure no one is accidentally committing changes with that? Easy!
```python
import os

import xcodeproj

project = xcodeproj.XcodeProject("/path/to/project.xcodeproj")

for item in project.fetch_type(xcodeproj.PBXFileReference).values():
    assert os.path.exists(item.absolute_path())
```
This library makes Xcode projects something which can be part of your code reviews and no longer some mysterious black box where people automatically approve changes to pbxproj files.
Another personal creation here. xcresult does exactly what it sounds like: it lets you work with xcresult bundles. When you build, run tests, etc., Xcode will generate an xcresult bundle with the, you guessed it, results of the operation in there. Reading it to get the data out, though, is a whole different story.
For example, let’s say you run snapshot tests and one is failing. You know there are two images in there somewhere, how do you get them out? There’s absolutely no hint in the logs. Thankfully, it’s relatively easy:
```python
import os

import xcresult

results_bundle = xcresult.Xcresults(results_bundle_path)

attachments_path = "/some/output/folder"
os.makedirs(attachments_path, exist_ok=True)
results_bundle.export_attachments(attachments_path)
```
Now all the images, etc. that are in this bundle are available as PNG images. These can then be easily surfaced to whatever CI system you are using so that developers can easily see exactly what went wrong. For example, if you use Azure DevOps, you might see something like this attached to your build:
Dealing with simulators can be tricky at the best of times. So many questions arise around things like “Do you wipe them after each test run?”, “If so, how?”, “How do I create a simulator for a test for a particular device?”, etc. isim (and you might be seeing a pattern here) tries to make all of that as simple as possible.
Many of you reading this will be familiar with the `xcrun simctl` command. If you are working in a system where Bash works for you, then you don’t need to read any further. If you are a Python shop, then isim will be a life saver. It’s essentially a wrapper around that command, designed to be as simple as possible to use while staying familiar if you already know the command.
For example, `xcrun simctl list runtimes` becomes `isim.Runtime.list_all()`. And in general, `xcrun simctl do_thing [DEVICE_ID] arg1 arg2` becomes:
```python
device = isim.Device.from_identifier(DEVICE_ID)
device.do_thing(arg1, arg2)
```
If your CI is Python based and you aren’t using isim, then either you are making life harder for yourself, or you have a fantastic solution of your own I’d love to know about!
Localization is incredibly difficult. In the best case scenario, you write some strings, send them off to translators, get them back and ship them. But what if there was a mistake? What if you sent the string `Hello %@!` where you’d replace `%@` with the person’s name, but your French translators send back `Bonjour!` with no token? Well, at runtime, your app is going to crash. Ok, sure, it’s unlikely that this would happen, but what if you have 2000 strings in your app? Then it’s 2000 times more likely to happen. What if you support 70 languages? Then it’s 140,000 times more likely to happen! At that scale, mistakes happen. How do you catch them? With localizationkit. This tool is a suite of tests to ensure that your localized strings are the best that they can be. What sorts of things can it check for?
- Tokens that aren’t numbered when a string has more than one (e.g. `Hello %1$@, the weather is %2$@` instead of `Hello %@, the weather is %@`)
- Invalid tokens (e.g. `The stocks went up to 100 %*`)

This tool alone has saved us countless times from runtime crashes.
I mentioned above that dotstrings integrates with other tools. This is one example. localizationkit is platform agnostic. It takes in a string “collection” where each string consists of a key, value and comment. Combining the two to test is trivial:
```python
import dotstrings
import localizationkit

bundle = dotstrings.load_all_strings("/path/to/table.strings")
strings = [localizationkit.LocalizedString(string.key, string.value, string.comment, "en-GB") for string in bundle]
collection = localizationkit.LocalizedCollection(strings)

results = localizationkit.run_tests(config, collection)  # `config` lets you set various parameters
failures = [result for result in results if not result.succeeded()]
assert len(failures) == 0, f"Encountered failures: {failures}"
```
I know, it’s a super similar name to the previous entry, but I wasn’t responsible for the naming scheme, just the code! Out of all of the examples I have here, this is the only one which is a derivative of some earlier work. This work was done by one (or more) of the engineers at Acompli and continues to this day, just in a significantly different form.
LocalizedStringKit is unique in this list as it’s not only a Python program. It has a Swift/Objective-C counterpart too: https://swiftpackageindex.com/microsoft/LocalizedStringKit This tool makes it easier than ever for developers to localize their apps without even needing to think about it!
Normally, the flow to localize a string goes something like this:

```swift
label.text = "Your account was successfully added!"
```
You get my point.
With LocalizedStringKit, you do this:

```swift
label.text = LocalizedString("Your account was successfully added", "Shown to the user in an alert when they've added an account to the app, letting them know everything was successful")
```
Then, when you are done, you run the generation command:

```
localizedstringkit --path /path/to/my/project/root --localized-string-kit-path /path/to/my/project/root/LocalizedStringKit
```

(which you are obviously going to provide a wrapper/alias for that is easy to remember). That’s it. You can add a check in your CI to ensure no one forgets to run the generation script either.
It works by taking a hash of the English string as the key, which is therefore deterministic. Developers’ lives are significantly simpler and less error-prone now.
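To illustrate the idea only (this is a toy sketch, not LocalizedStringKit’s actual hashing scheme): because the key is derived purely from the English text, the same source string always produces the same key, so no one has to invent or look up key names.

```python
import hashlib

def string_key(english: str) -> str:
    # Deterministic key derived from the English string itself.
    return hashlib.md5(english.encode("utf-8")).hexdigest()

print(string_key("Your account was successfully added"))
```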
Interacting with the system keychain from the command line can be a nightmare at the best of times. We all have to do it to install certificates, secrets, etc. and it never gets any easier. So let’s bypass the CLI entirely and use keyper in Python instead.
Getting a password is as simple as: `password = keyper.get_password(label="my_keychain_password")`
Installing a certificate is just 3 lines of code:
```python
with keyper.TemporaryKeychain() as keychain:
    certificate = keyper.Certificate("/path/to/cert", password="password")
    keychain.install_cert(certificate)
```
Of course, you can install to the system keychain, you just need to make sure it is unlocked first.
For this tool, if you are handling certificates or passwords, there is simply no easier way to get them into the keychain.
Our first two here are Microsoft stack specific, so if you use something else, feel free to skip.
There’s no point in dressing this one up. appcenter is a Python wrapper around the App Center APIs. There is an OpenAPI version, but we found that the code it generated was difficult to understand and use. appcenter was born from that. Here are some examples of how it works:
```python
# 1. Import the libraries
import datetime

import appcenter

# 2. Create a new client
client = appcenter.AppCenterClient(access_token="abc123def456")

# 3. Check some error groups
start = datetime.datetime.now() - datetime.timedelta(days=10)
for group in client.crashes.get_error_groups(owner_name="owner", app_name="myapp", start_time=start):
    print(group.errorGroupId)

# 4. Get recent versions
for version in client.versions.all(owner_name="owner", app_name="myapp"):
    print(version)

# 5. Create a new release
client.versions.upload_and_release(
    owner_name="owner",
    app_name="myapp",
    version="0.1",
    build_number="123",
    binary_path="/path/to/some.ipa",
    group_id="12345678-abcd-9012-efgh-345678901234",
    release_notes="These are some release notes",
    branch_name="test_branch",
    commit_hash="1234567890123456789012345678901234567890",
    commit_message="This is a commit message"
)
```
What more is there to say? If you use AppCenter, this library will be a life saver.
Just like above, there is a Python wrapper around the Azure DevOps (ADO) APIs, but it is difficult to understand and use, and makes reading code reviews significantly more complex as it is difficult to understand intended behavior. Enter simple_ado. The ADO APIs are expansive and simple_ado can’t possibly cover them all (the clue is in the name: simple), but it covers the majority of what you would ever need as an iOS/macOS developer. You can manage builds, pull requests, work items, commits, teams, identities, security, plus a ton of other things.
Taking hold of your CI is of the utmost importance for any team. If you use Azure DevOps, this is the tool you want to use. Point me at a different CI and I’m going to create the same thing again for it.
There’s something so satisfying about saving the best for last. There isn’t an iOS developer out there who hasn’t heard of Fastlane. If you want to automate your release process, Fastlane is the tool to use. Unless you aren’t Ruby devs… At which point where do you turn? A few years back, Apple announced they were opening up the App Store Connect APIs. The capabilities don’t yet match what Fastlane is capable of (which uses web scraping if an API isn’t available), but it covers the majority of cases that any developer would care about.
With asconnect, you can easily upload builds, create new App Store versions, and submit them for review, plus a bunch of other things. Outlook switched from Fastlane to asconnect almost 2 years ago and has never looked back. No more issues dealing with Fastlane not working because Apple changed a page layout. The APIs work. Every. Single. Time.
As an example of how easy it is to use, let’s look at uploading a build and creating a new app store submission:
```python
import asconnect

client = asconnect.Client(key_id="...", key_contents="...", issuer_id="...")

# Upload the build
client.build.upload(
    ipa_path="/path/to/the/app.ipa",
    platform=asconnect.Platform.ios,
)

# Wait for it to finish processing
build = client.build.wait_for_build_to_process("com.example.my_bundle_id", build_number)

# Create a new version
version = client.app.create_new_version(version="1.2.3", app_id=app.identifier)

# Set the build for that version
client.version.set_build(version_id=version.identifier, build_id=build.identifier)

# Submit for review
client.version.submit_for_review(version_id=version.identifier)
```
It’s as simple as that. If you aren’t already using a similar tool, you will no longer have to have someone do these steps manually every week. And if you are using Fastlane, which is a phenomenal tool that asconnect can never hope to compete with, you won’t have to worry about it breaking because Apple made some changes to a random web page.
It turns out that it’s not that much harder. You just need to send a POST to `http://127.0.0.1:7071/admin/functions/MyTimerTrigger` with a body of `{}` and the content type set to `application/json`.
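For example, a minimal sketch using requests (the function name `MyTimerTrigger` is just the placeholder from above):

```python
import requests

# Trigger the timer function on the local Azure Functions host.
# `json={}` sends a body of {} and sets Content-Type: application/json.
response = requests.post(
    "http://127.0.0.1:7071/admin/functions/MyTimerTrigger",
    json={},
)
print(response.status_code)
```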
That’s all there is to it. If you miss out the body, or don’t do a POST, you’ll get the function information back instead of triggering it.
For most package manager manifests (Gemfile, Podfile, package.json or Cargo.toml), there’s a corresponding .lock file. It’s not always clear what the purpose of these files is, and whether or not they should be checked in to your repo.
I’m going to use CocoaPods as the example for this, but most package managers are the same and the same logic applies.
Here’s an example Podfile:
```ruby
platform :ios, '10.0'

source 'https://github.com/CocoaPods/Specs.git'

target 'MyApp' do
  pod 'SwiftLint', '~> 0.29.1'
  pod 'OCMock', '~> 2.0.1'
end
```
This file basically says: install a version of SwiftLint that is at least 0.29.1 and up to, but not including, 0.30.0. Similarly, it wants OCMock from at least 2.0.1 up to, but not including, 2.1.0. Different tools use slightly different operators for these sorts of things, so make sure you are using the right one for your tool.
When we run `pod install`, CocoaPods analyses this file, finds the dependencies, figures out what versions it can possibly install, and then does so. By default, most package managers will take the latest possible version that meets the requirements specified. So if version 0.29.7 of SwiftLint was the latest available, that’s what would be installed. OCMock is still on 2.0.1, so that’s what you get.
You then go and create some project using these dependencies, check in your Podfile, and everything looks good. Someone else on your team checks out the commit and runs `pod install`, and they end up with OCMock 2.0.1 but SwiftLint 0.29.8. That could be problematic. In theory, the two versions should be compatible, but despite everyone’s best efforts, mistakes can still be made. How do you make sure that everyone on your team gets the same version as you? Well, that’s where the lock file comes in.
When you run `pod install`, not only does it resolve the dependencies and install them, it also generates a Podfile.lock. This file contains the exact versions of all the dependencies (and their dependencies) that were installed. If you run `pod install` and there is a Podfile.lock, then instead of resolving the dependencies and taking the latest possible, it looks at the versions in the lock file and installs those. Now, if you check in your Podfile.lock, when your team mates run `pod install`, they get the exact same versions that you have.
So, it’s clear where lock files help. But do you always need them? What if you pin your versions exactly? Instead of `pod 'SwiftLint', '~> 0.29.1'` you use `pod 'SwiftLint', '0.29.1'`. If that’s the case, you will end up with exactly the same versions, right? That’s true for the most part. If you specify the versions explicitly, then you will get the same versions. However, if you have 30 dependencies, it can be annoying to update each one manually in turn. Using SemVer you can pin compatible versions of each tool, get most of the benefits, and then, importantly, just run `pod update` and it will resolve the latest possible dependencies and install them, updating your lock file as it goes. So, you don’t have to use a lock file, but it has its advantages.
One additional benefit of using lock files, even if you pin exact versions, depends on your exact tool. Lots of package registries let you change the version already in place. That means that some developer might upload version 1.2.3 of their tool, but realise they made a mistake. Then instead of pushing 1.2.4, they just “fix” 1.2.3. This means that you could have two different versions of 1.2.3. In theory, this shouldn’t happen, but it does (and sometimes maliciously). Using a lock file lets you verify that the version you have is the same as the version others have since the hash of the dependency needs to match. But this depends on your tool.
```json
{
    "name": "Hodor",
    "age": 42
}
```
A naive approach might just do something like:
```python
import requests

def get_data():
    data = requests.get("https://example.com")
    return data.json()
```
Then when you want to access the data, you do:
```python
name = data['name']
age = data['age']
```
This works, it’s simple, and it’s easy. It also conforms to one of the unwritten rules of Python: it’s better to ask forgiveness than permission, i.e. hope that the values are there and deal with it if they aren’t, rather than checking that they are there first.
Python isn’t the only language I write day to day. My time is split between it, Swift, Objective-C and C#. One of the features that both Swift and C# have is the ability to deserialize JSON directly into objects. That lets you define an object which looks like the response you expect and parse directly into it. Here’s a Swift example:
```swift
struct Person: Decodable {
    let name: String
    let age: Int
}

guard let person = try? JSONDecoder().decode(Person.self, from: responseData) else {
    print("Error: Couldn't decode data into Person")
    return
}

print(person.name)
print(person.age)
```
This pulls out the data, type checks it and places it into a newly created object for you. This implementation lets you handle missing keys by using optional types, you can specify alternative identifier mappings if you want your property to be named differently than the API key, etc. You can be sure that whatever comes back from the API definitely conforms to Person. C# has a very similar ability using the Json.NET library.
So, how did I want to return the data from my library? Well, I decided that if I could parse it into an object, performing the validation, type checking, etc., this would result in a safer library for users to consume. There would be fewer surprises in store, and it would be easier to use. So I started writing out my Python code to handle this:
```python
class Person:

    def __init__(self, name, age):
        self.name = name
        self.age = age

    @staticmethod
    def from_json(json_data):
        name = json_data.get('name')
        age = json_data.get('age')

        if name is None:
            raise Exception('No "name" was found in the data')

        if age is None:
            raise Exception('No "age" was found in the data')

        if not isinstance(name, str):
            raise Exception('"name" was not a string')

        if not isinstance(age, int):
            raise Exception('"age" was not an int')

        return Person(name, age)

person = Person.from_json(json_data)
```
Phew… that’s a lot of boilerplate, but it gets the job done (let’s not get into namedtuples, dataclasses, etc.). It checks that the values are there. It checks that they are the correct type. The end result is a nice, safe and clean type that the user can use without any surprises.
The problem is that I am explicitly checking, rather than letting the user handle it if it goes wrong, and this goes against the rule I mentioned above. However, clearly when writing an API wrapper, the burden of validating the responses is on the wrapper and not the end user. This clearly shows that the permission vs forgiveness rule is not right in all cases1. This is just one case though, there are many others.
So, Python devs, before you repeat the line about permission and forgiveness, pause for a moment and actually think about it. Is that actually what’s best, or is it just something you’ve believed without knowing why?
The code above is ridiculously verbose and it’s just checking two properties. I was dealing with a lot more. The validation code got insane. That doesn’t even include the ones where I wanted to do things like convert a Unix timestamp to a datetime.datetime, etc. Inspired by the Swift and C# solutions, I went and created deserialize2 which takes advantage of type hints. By using this, the solution above can be condensed to:
```python
import deserialize

class Person:
    name: str
    age: int

person = deserialize.deserialize(Person, json_data)
```
That does all the same checks as above, but is obviously much easier to read and maintain. It also lets me continue working on the wrapper without feeling like my soul is being sucked out as I validate the 400th value.
Even the creator of Python thinks that it is a bad rule: https://mail.python.org/pipermail/python-dev/2014-March/133118.html ↩
I initially looked for an existing library that did the same thing, but couldn’t find one. Sure enough though, as soon as I had something that did what I needed, I found what I was looking for in the first place: https://git.iapc.utwente.nl/rkleef/serializer_utils Don’t just blindly go with my implementation. Have a look at both solutions and use the correct one for you. ↩
Git is one of the most difficult tools for people to wrap their heads around. Interspersed with the git alias blog posts (yes, I do love irony, why do you ask?) are the git tutorial posts. Sure, years on it all makes sense to us, but for beginners, understanding what each command does is the most important thing.
Aliases break that. Well, that’s not true. YOUR aliases break that. Here are 4 different categories of alias and reasons why each is bad.
The first is the set of typing savers:
- `git p` → `git pull`
- `git co` → `git checkout`
These teach new users to use these commands rather than the real ones. When it comes to learning about something else, there is a separation between the commands they are learning how to use more effectively and the commands that they know how to use. It can be difficult to reconcile that `pull` is `p` each time you see it. Sticking with the original commands, at least until you are quite comfortable with git, teaches you how to think about it in a consistent way which is supported by the rest of the community.
In slot number 2 we have the aliases which string multiple commands together:
- `git cm` → `git add -A && git commit -m`
- `git cp` → `git commit && git push`
These commands can simplify the flow for beginners, I won’t deny that. But they do it at a cost. In this case, it can be difficult to reason about what is going on under the hood when a user runs `git cp`. It doesn’t force you to face the idea of there being a staging area, committed code and pushed code. Instead, it treats it more like SVN does, where you have changes locally or you have them on the server. It ties up the usefulness of git into a less functional wrapper, and makes it harder to learn the individual components.
For number 3, we have the commands which provide flags to existing commands:
- `git rbc` → `git rebase --continue`
- `git fix` → `git commit --amend --reset-author --no-edit`
This is the hardest to argue against. The flags can be hard to remember. That’s a fact. Having aliases helps with these. In this case though, it takes away the configuration possibilities, as well as your understanding of the individual components, as above. The `fix` command for example makes it easy to make a change and roll it into the previous commit. However, without the flags, you don’t really understand the mechanism by which it does this. Not to mention the fact that if you are ever on a machine without your aliases, you’ll have no idea what you are doing.
The final category is workflow specific aliases:
- `git up` → `git pull --rebase --prune $@ && git submodule update --init --recursive`
- `git wip` → `git add -A; git ls-files --deleted -z | xargs -0 git rm; git commit -m "wip"`
- `git unwip` → `git log -n 1 | grep -q -c wip && git reset HEAD~1`
Each of these is tailored to a specific person’s or team’s workflow. They may not be useful to a new user, and pushing them to use them is giving them a nice shiny new hammer and saying “Look at all these things which can be treated like nails!”. A user needs to decide their own flow. Git lets you customise it with aliases so you can adapt your tooling to your flow. It’s a terrible idea to ignore all that and make your flow fit someone else’s tools when you have your own ones.
So, I’m not against git aliases. In fact, 2 of the aliases above are from my own config files. What I’m against is confusing users with new things, and hiding the truth from them. Once a user has used git on its own and understands their own flow, that knowledge, combined with the knowledge that you can add lines like:
```
alias_name = some_sub_command -with -flags
other_alias = !git whatever | other_command
```
is enough for them to build their own tools, customised to their own knowledge that suit their own flows.
Let users learn their own aliases.
P.S. Except log aliases. Share those as much as you want. No one is going to remember how to write those each time.
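For what it’s worth, here’s the sort of thing I mean: a sketch of a log alias in the same format as the config lines above, which you should adapt to taste rather than treat as gospel.

```
lg = log --graph --abbrev-commit --pretty=format:'%C(yellow)%h%Creset %s %C(cyan)(%cr)%Creset %C(blue)<%an>%Creset%C(auto)%d%Creset'
```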
A few weeks ago a post appeared on the 1Password blog. It didn’t receive much traction until today though. Essentially, the post lays out a new version of 1Password for Windows which will make the current file formats read-only and force the user to pay a subscription fee to store their passwords on 1Password’s servers. Now, I want to make it abundantly clear that I don’t have a problem with their fee model being subscription based. In fact, I practically welcome it. I paid for 1Password 3 or 4 years ago and got a licence for Windows, OS X, Android and iOS. I gave them something like £80 for all that, once. This is a company who I trust with my security. I don’t want them to be scraping by and trying to figure out how to pay their employees. I don’t want them cutting corners in order to get sales. Keeping the company healthy is in my best interests as a user. A subscription model would guarantee income for them, and would help me sleep a little easier at night.
So where is my problem? It’s the fact that I no longer have control over my vault. My passwords will automatically be synced to 1Password’s servers and will no longer be in my control. “But your passwords are encrypted before being sent to them!” I hear you cry. Sure, I’m not denying that. I have faith that AgileBits have implemented this mechanism safely and securely. The problem isn’t my passwords on their servers. It’s everyone’s.
Servers full of passwords are wonderful targets for thieves. Now imagine servers full of password vaults. What Fort Knox is to a bank robber, 1Password’s servers will be to black hats. AgileBits have just painted a target on their backs and don’t even realise it.
So let’s say that someone does compromise their servers (I said I had faith in AgileBits, but no one writes perfect code all the time), what happens next? I have a strong master password, so I know that my passwords would be safe. But what about the thousands of people who don’t have strong master passwords? Their vaults are essentially ripe for the picking. Now, they do have a secret key which is said to never leave your device and your encryption key is derived from your master password and your secret key. This is a great feature that I don’t want to put down. However, if you are a black hat, have 100,000 password vaults, 10% of which have weak passwords, and no secret keys, what do you do? You start working on getting those keys. If a user is using a weak master password, you can almost certainly assume their general security habits aren’t great. A targeted attack on any of these people would have a high chance of success.
Ok, enough about the security. What else is wrong with this approach? Well, for a start, I now must have a network connection in order to be able to add my 1Password vault to a machine. I can’t keep it on a USB drive and use that on each machine. If someone has an air-gapped machine, then it’s no use.
Next up is the fact that if I want to continue getting updates, some of which might be critical for security, I need to upgrade, which means my vault is made read-only. I might be left with a choice of being secure, and being able to use my password manager. I can’t speak for everyone, but it really feels like I’m being left out in the cold on this one.
The final thing is the worst. It’s the fact that it really seems like AgileBits just doesn’t care about its users any more. I would guess that a significant number of people (myself included) use 1Password over competitors like LastPass or Dashlane because it doesn’t sync to a central server. This feels like a money grab rather than something that has been done to make their users happier. As I mentioned, I’m fine with a subscription for the licence, just please leave me in control of my passwords.
1. Do you use source control?
2. Can you make a build in one step?
3. Do you make daily builds?
4. Do you have a bug database?
5. Do you fix bugs before writing new code?
6. Do you have an up-to-date schedule?
7. Do you have a spec?
8. Do programmers have quiet working conditions?
9. Do you use the best tools money can buy?
10. Do you have testers?
11. Do new candidates write code during their interview?
12. Do you do hallway usability testing?
The beauty of the Joel test is that in a relatively small number of yes or no questions, you can get the answers you are looking for. There are no exceptions or caveats that need to be taken into account, it’s purely yes or no to each question and you have your results. A score of 12 is perfect, 11 tolerable and 10 or less is a failure. This beauty did not go unnoticed by me or by a few of my fellow students. I quickly used it to gauge the quality of companies I was interviewing at for internships and later for my first job out of university. Most employers had never heard of the test, which wasn’t surprising. What was surprising was how many companies, which I had previously thought were good, failed the test.
Was I getting the questions wrong? Was I not making them clear? Maybe I just couldn’t count properly all the way to twelve?
The test which I had been told was effectively the holy grail of questions in response to “Is there anything you’d like to ask us?”, was effectively a failure to me. Sure, it might just be that all the companies I interviewed at were terrible, but despite being new to the software world, I had a feeling that wasn’t the case. While I did take the test into consideration for my various applications, it was never a deciding factor for anywhere.
So what went wrong with the Joel test for me?
Each programmer is different. They will have different views about what is good and what is bad, and these views will change over time. The Joel test tries to be unbiased in this and asks objective questions, rather than subjective ones. No programmer out there is going to disagree that tracking bugs is a good idea. But what about some of the other questions?
Created in 2000, the Joel test is now 17 years old. The test is older than OS X. When it was created, Windows 2000 was state of the art, the PlayStation 2 had just been released, and Pentium III chips were whizzing along at 1GHz. In the computing world, it was an eternity ago. As the power of computer processors (and hardware generally) was doubling every 18 months, our development processes were improving too. The questions in the Joel test reflected the epitome of software development at the time. In 2017 though, they just aren’t quite as relevant.
Taking question #1 as an example:
Do you use source control?
At the time this was written Git, today’s most popular VCS, didn’t exist. Well sure, we all know that Git didn’t appear on the scene until 2005. What’s easy to forget is that Subversion, the previous King of the VCS world, wasn’t released until October 2000. That’s 2 months after the Joel test was originally published. Source control existed, but it was in no way as common as it is today. Does that mean that we should even bother asking this question any more? Perhaps. Should we modify the question to better suit today’s world? Probably.
Let’s take a look at each question in turn to see how it fits into today’s world.
Despite picking on it above, this question still seems relevant in today’s world. All teams should use source control, there is no reason not to. Today’s debates revolve around whether we should be using a distributed VCS or not. Normal projects can use Git or Mercurial (as examples) just fine. Larger companies with enormous repositories find that they need to use a centralised version control system. There isn’t a right or wrong answer to this. What’s important is that version control is used. However, given that this is so common, I’m not convinced that this question has a place today.
Proposed update: Remove rule.
Joel clarifies this one by stating:
By this I mean: how many steps does it take to make a shipping build from the latest source snapshot?
The reasoning behind it still rings true today. However, I do feel that this is our first contender for an out of date question. I’d propose that making a build should be done in zero steps. Continuous integration now exists and is ubiquitous. It is wise, and indeed sane, to use it. The process of creating a ship build should be that developers commit their code, and then when management, or whoever is responsible, decides to release, they press a “release” button, and that’s it. All developers need to do is make sure their code is committed and the CI server takes care of the rest.
CI has brought the development industry forward by leaps and bounds. It can carry us even further if we all use it, and use it to its fullest potential.
Proposed update: Are all builds handled automatically by a Continuous Integration server?
This follows on with #2. Daily builds are fine if that’s what works for your team. I’d say that daily should be the minimum bar though. A better approach, in my eyes, is to make sure that builds are created for each commit to your main development branch. This is a sort of “bleeding edge” kind of build, but it means you will find your issues faster, and more effectively since it’s easier to pin down the version when something broke.
This build should be done by your CI. There is a second factor to this though. Not only should you make daily builds, but you should use them as well. This is, admittedly, harder when you work on software which isn’t related to your day to day life (either personally or professionally). In the case where you write an accounting app, or a controller for a rocket, it’s unlikely that you will need to use these products. However, I think it’s a good idea to sit down with your software for at least 20 minutes a day and just use it, making sure that everything still works. In the case where you do work on something you can use1 day to day, this is even better. It fulfils the same role as a 20 minute session, but in practice, you are going to do a lot more than 20 minutes of testing, and will expose the project to more real-world scenarios.
Proposed update: Do you make and use daily builds?
Well of course you should use a bug database of some kind. Have you ever met a developer who thinks that you shouldn’t? Where this one changes is that today our bug trackers are a lot more fully featured. No longer are they just simple interfaces to a table in a database, they are full featured planning and management applications, often with a key distinction between tasks and bugs (a sneak peek at #5).
When you create a bug today, you give it a title, description, repro steps, affected platforms, etc. just as you’ve always done. Where it goes differently, is that now, management are going to go through your bugs, assign them to sprints, you are going to move things around on a Kanban board, your Scrum master is going to have conversations with you about difficult tickets, etc.
These issue trackers are no longer just about keeping a record of what needs to be done. They are about who does it, how it affects the team, what effect it has on the product and a hundred other variables. Now, while I believe you should have some sort of software that can do this, it isn’t essential that it is one and the same with your issue tracker. Nor is it something which gives a simple yes or no answer, due to the fact that every team is different. The only minor change I’d make is to word it slightly differently to reflect the capabilities that we have today.
Proposed update: Do you use an issue tracker?
Joel has a great explanation of why this is so important, and 17 years on, it still holds true.
Proposed update: N/A
As a developer, you might not care so much about this. Why does it matter if your team doesn’t meet your deadline? It wasn’t your fault? Leave this to management.
Leaving it to management might work, but it probably won’t. Today, with the various different agile methodologies we have, getting estimates from developers, or at least someone who is capable of calculating a reasonable estimate, is one of the key factors in planning. It helps prioritise what the team is going to work on, who is going to do it, and when they are going to do it. You can stick your fingers in your ears and pretend that it has nothing to do with you, but at the end of the day, each and every developer has a responsibility to the team.
Does it still hold today though? Definitely. Sure, we don’t use waterfall any longer, and making sure we hit that testing phase on time doesn’t make much sense any longer, but even in agile (which lots of people believe to mean “making it up as we go along”), and indeed, especially in agile, the schedule is the master and needs to be kept up to date. The difference today is that the schedule can change to accommodate new information.
Proposed update: N/A
While the schedule rule may have passed through unscathed, the same thing can’t be said for the spec unfortunately. In the waterfall days, the spec was crucial as you needed to know what you were building. If you didn’t you couldn’t allocate resources effectively, and your project was, effectively, doomed to failure.
Back in 2001, when you created your program, whether it be for a coffee maker, a rocket, or just a straightforward CRUD interface, you knew what had to be done, and roughly how much work it would take to do it. Developers, testers and PMs were assigned, they worked through it all, tested, and shipped, before moving on to the next project. Today though, things are a little different in a lot of cases. Sure, if you are making any of the above, things are still similar, but what about if you are writing a Twitter client? Or a text editor? Maybe even an email client? Are any of these ever “done”? Usually not.
A spec works when you can get all the information up front, and plan things out. When you are working on a continual piece of software, the spec is just as important for your MVP, but after that it is effectively useless. In order for the spec to be as important in today’s world, you need to make sure that you are continually taking into account the latest information and adjusting your process and product accordingly.
This information can take many forms. It might just be simple telemetry about what people are doing with the app, or it could, and possibly should, be more advanced, such as the results of A/B testing new features and designs.
Proposed update: Do you have up to date information on your product’s performance and usage?
Images of software development in the 80s and 90s bring to mind a sea of cubicles. Today, the result is pretty similar, the only difference is that the dividers are often much smaller, or missing entirely, in the interests of having an “open” office.
Personally, I’m not a fan of the open office. I like having an office of my own where I can have dedicated development time, I can personalise it how I like, and I can turn up my music when I want to get into the zone. However, not everyone agrees with me. I know many people who like the idea of open offices. They promote more effective and direct communication between members of a team, often leading to faster and better solutions to problems.
The rule though asks about quiet working conditions, not solitary ones. Open offices, by their very nature, tend to be noisier than having your own office. When asked about the noise issues of open offices, the usual response is to just wear headphones. With noise cancelling technology, this is a reasonable fix in most cases, and the one I use often when I’m in a shared office.
Like it or not, quiet working conditions are generally tied to having your own office vs having an open office. There is no right or wrong answer for which is correct.
Proposed update: Remove rule
This rule was the hardest to come to a decision on. I actually ended up writing most of the rest of this post before coming back and deciding on this one.
The issue that I had with it is that often, companies can’t afford the best tools money can buy. Just because they can’t today, doesn’t mean they won’t be able to tomorrow. However, I think the spirit of the rule still stands. It’s the duty of the company to make the lives of the developers as easy and as happy as possible. Happy and unstressed developers produce better code.
One key point though to remember about this is that this does not mean that money needs to be thrown at the problem. If the best tool is free, then that’s the one that should be used.
Proposed update: N/A
Testers are an area that every company does differently. A couple of years ago, between when I was an intern and when I went full time, Microsoft merged the engineer and tester roles. I originally wrote about why I thought that merging testers and developers was a good thing due to the resources I wasted as an intern, and how removing the safety net of testers encouraged me to write better code. Now, that was true for Microsoft, but let’s look at what Joel has to say about this:
If your team doesn’t have dedicated testers, at least one for every two or three programmers, you are either shipping buggy products, or you’re wasting money by having $100/hour programmers do work that can be done by $30/hour testers.
The main difference between how I viewed this and how Joel does is that at Microsoft, testers and developers were equal. Developers weren’t getting paid 3 times as much as testers. The time a tester spent going over changes, and making sure fixes worked, and didn’t introduce problems, was just as valuable as a developers time. However, it still felt like a safety net.
Having testers is a great thing, but Joel makes the best point, and that is make sure you aren’t wasting the value of your testers.
Developers need to take on the responsibility for their own code. If you aren’t very confident that your code is bug free, it shouldn’t be checked in. That means you need to run manual tests, unit tests, integration tests, etc. You aren’t going to be checking in code without going through code review, so why check in without going through tests?
Testers don’t need to go through each and every change, but they do need to make sure the product as a whole is still stable. They should be able to spend their time going through a test matrix that is far larger than would be worth a developer doing. Having testers isn’t the key, having a comprehensive and efficient test plan is.
Proposed update: Do you have a comprehensive test plan?
If you were to search for the keyword “interview” on various development sites, such as Hacker News, or some sub-reddits, how many of these resulting articles would have something positive to say about the process? Very few. Everyone is aware that the current interview system is broken and we need a better one. Until the time that someone can come up with one, assuming that a single one exists, which is unlikely, we need to work with what we have.
There are a few different ways of handling interviews currently:
The thing that they all have in common is that the developer is expected to write code. If that’s what their job is going to be, that’s what you should test them on. Don’t penalise them for getting the syntax of a particular operation wrong, or not choosing the language that your team works in, but make sure they can write code.
As we have seen before though, this one is so ubiquitous that I don’t think it’s worth including any longer.
Proposed update: Remove rule.
As one of the lesser known rules, let’s take a look at Joel’s definition:
A hallway usability test is where you grab the next person that passes by in the hallway and force them to try to use the code you just wrote. If you do this to five people, you will learn 95% of what there is to learn about usability problems in your code.
Now, you aren’t going to hear any argument from me that hallway usability testing is a good thing, but things have come a long way from four thousand grey buttons on a grey tool bar, in a grey window on a grey desktop. People who use computers (which has a very wide range of definitions these days), aren’t necessarily well versed in how they work. For specialised tools, having what you need at your finger tips with buttons, drop-downs, keyboard shortcuts, etc. is still great. For Facebook however, your grandparents need to be able to use it. It needs to have a clear design, with high contrast between elements, have interaction points located in sensible places, and be suitably sized.
Creating a user interface and experience of any kind, and by that I don’t just mean the regular UI, but making sure things like saving a file are handled in a sensible manner, is no longer something that any developer can do, or should be expected to do. It is now a specialised skill, which should be handled by dedicated UI and UX designers.
Proposed update: Do you have dedicated UI and UX designers?
So, let’s see what our test now looks like:
1. Are all builds handled automatically by a Continuous Integration server?
2. Do you make and use daily builds?
3. Do you use an issue tracker?
4. Do you fix bugs before writing new code?
5. Do you have an up-to-date schedule?
6. Do you have up to date information on your product’s performance and usage?
7. Do you use the best tools money can buy?
8. Do you have a comprehensive test plan?
9. Do you have dedicated UI and UX designers?
We’ve cut a few rules, and improved others. But we aren’t done yet. This was how we can change the existing test to be more suitable for modern development, but we need to add in new rules that matter today.
Everyone makes mistakes. It doesn’t matter if you are fresh out of school or a war hardened veteran of software, you will make mistakes. All too often we depend on the compiler to tell us about our mistakes, and that’s not how it works. That’s why we test. But testing is fallible too. We can’t rely 100% on it either. Having a fresh pair of eyes on your code will see things that you simply didn’t see. Furthermore, they can help you make sure your code follows your team’s style, which is something your compiler can’t do (linters are a topic for a different time). Having consistent code helps others on your team read and use your code.
This one goes hand in hand with the one above. Writing code is all well and good, but the trick is that you need to be able to read it again. Unfortunately, this isn’t always the case, and we should do what we can to ensure that code is as easy to read as possible. The number one way of doing that is ensuring the code is consistent in the code base. By having a set of standards which the team must adhere to, it makes it much easier to read and understand code, search through the code base for particular components, types, etc., and helps new developers get up to speed by clearly defining the style expected of them, rather than forcing them to figure it out as they go. This can also be paired with a linter in your CI system to enforce the standards.
It seems obvious, but it just isn’t in lots of places. Many new developers join a company and get thrown in at the deep end. This process is a waste of time for the developer and the company. Some will manage to swim, but most will sink. When you join a company, you should be given the information you require to do your job. Now, I’m not saying that you should be sat down in a conference room for a week to sit through HR sessions and proclamations from VPs, and in fact I hope that this isn’t the case, but you should be given the documentation for the system you work on, and have someone already on your team assigned as a mentor for a reasonable time period so that you can get up to speed as quickly as possible. Yes, that means that the other developer will have a lower output during the training session, but after that you get two developers, so anyone who can do basic arithmetic can see that it is a worthwhile investment.
So, let’s see what our new “Joel Test” for 2017 looks like:
1. Are all builds handled automatically by a Continuous Integration server?
2. Do you make and use daily builds?
3. Do you use an issue tracker?
4. Do you fix bugs before writing new code?
5. Do you have an up-to-date schedule?
6. Do you have up to date information on your product’s performance and usage?
7. Do you use the best tools money can buy?
8. Do you have a comprehensive test plan?
9. Do you have dedicated UI and UX designers?
10. Does all code go through code review?
11. Do you have coding standards?
12. Are new employees given training?
It’s perfect! We shall all use it indiscriminately and rejoice.
Except no… The test isn’t a panacea. It’s just a guide. It’s a flag system rather than a thorough evaluation. A company can get a 12/12 score and still be a terrible place to work. However, a company that doesn’t get at least 10 is effectively guaranteed to be a terrible place to work. Make sure your potential employers get a high score of 11 or 12, and then use your own judgement. Don’t go in blind.
P.S. I’ve had this post sitting in my drafts for quite a while. Yesterday Stack Overflow mentioned that they are asking similar questions of themselves. Since there isn’t a right or wrong answer to this sort of thing (including what I wrote above - it could be totally wrong for you and your situation), I figured I’d get mine out, and hopefully prompt people to start thinking about this so that we can form a better idea as whole industry rather than as a lone developer.
It turns out that working on an email client, bugs tend to be found pretty quickly. Who would have guessed that developers, particularly at large companies like Microsoft, would use email quite often? ↩
Sticking with the Office add-ins example, here is how you can rewrite a call to a script to one you have locally for easy debugging. (If you don’t already have it, you can download Fiddler here)
Fiddler should do everything you need out of the box with one minor exception: SSL (or TLS, hopefully) requests. In order to make it work with these requests, there is an extra step. Open Fiddler Options from the Tools menu, and select the HTTPS tab. Make sure that the Decrypt HTTPS traffic box is ticked and click OK. Fiddler can now intercept your HTTPS traffic and decrypt it. The problem is that your machine will, hopefully, reject these requests due to being man-in-the-middled. In order to ensure that your device will accept these certificates, the Fiddler certificate needs to be added to your store. This varies from device to device, but the gist of it is: on the device which is making the connection you want to rewrite, browse to X.X.X.X:8888 in your web browser, where X.X.X.X is the IP address of the machine running Fiddler, and choose to download the FiddlerRoot certificate towards the bottom of that page. More details about how to handle this for your device or browser can be found here.
In the right hand pane, select the AutoResponder tab at the top. Make sure Enable rules and Unmatched requests passthrough are checked. Now, click Add Rule. At the bottom of the pane you will see a section named Rule Editor. For the first text field, you are going to enter EXACT:https://the.url.of/my/script/here.js, then for the bottom field, you have two choices. You can return a local file, or a response.
For the local file, just select Find a file... in the dropdown, browse to the file you want to return, and select Open. Then click Save on the right hand side.
If you want to return a different file hosted somewhere else (maybe a newer version of a script or similar), simply enter the URL of the file in the bottom field, and click Save.
There isn’t one. That’s it. Make sure your device is set to use the machine running Fiddler as a proxy (remember it’s on port 8888), and you are good to go. If it doesn’t work quite as expected, make sure files aren’t being returned from the cache.
Always use HTTPS - It’s 2016. You have no excuse. None.
Ok, now that we know to use HTTPS, why is hashing on the client a bad thing? Put simply, when you hash on the client, the hash becomes the password. Instead of authenticating with the server by sending a passphrase, you authenticate by sending a hash. This means that your password search space is reduced to the hash. Thankfully, with modern hashes this isn’t really the end of the world as the lengths are usually enough to still provide reasonable security. The problem is that it is suddenly much easier to brute force as they don’t need to run through the algorithms, just the hashes.
The second reason is the most important. If the server is compromised, all the hashes are available to everyone. Since only the hash needs to be sent to the server, whoever has this list now has access to all accounts, as though the passwords were stored in plain text (which they basically were).
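To make that concrete, here is a toy sketch of the problem (the names and scheme are illustrative only, not any particular product’s design): if the client sends a hash and the server stores that same value, the hash is the credential, and a leaked database entry can simply be replayed.

```python
import hashlib

# The server stores exactly what the client sends: the hash.
stored = {"alice": hashlib.sha256(b"correct horse battery staple").hexdigest()}

def login(username: str, client_sent_value: str) -> bool:
    # The server never sees the real password, so the hash *is* the password.
    return stored.get(username) == client_sent_value

# If the database leaks, an attacker replays the stored value directly;
# no password cracking is required.
leaked_value = stored["alice"]
print(login("alice", leaked_value))  # True
```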
So that’s why you shouldn’t hash passwords on the client.