Working with Collections Using Blocks – Introducing MCSCollectionUtility Repository

Working with collections seems to be one of the most common tasks for a programmer. As a result, every modern language tries to make this as easy and clean as possible. Fast enumeration and blocks are great examples of how language support can make collection operations cleaner.

When it comes to blocks, I think there’s still a lot of untapped potential that could give Cocoa’s collections a boost. To address this, I created categories for NSArray, NSDictionary and NSSet, and I would like to take this opportunity to introduce them and the ideas behind them. The MCSCollectionUtility repository is available on GitHub here.

Functional Programming

Before we move any further I would like to quickly introduce the concept of functional programming as it is related to the way we would like to perform operations on collections.

It is a programming paradigm that treats computation as the evaluation of mathematical functions. It tends to avoid retaining state or creating mutable data, which results in code that’s often easier to analyze. Moreover, functional programming is truly eye-opening. Using it to write even the most obvious program can be pretty challenging, especially in the beginning.

I really encourage everyone intent on broadening their horizons to try to learn functional programming. As always, there are multiple Web resources which are a great way to start. There are many functional languages to choose from, too. I think that because of its really simple syntax LISP is a pretty good place to start, even though it is rarely used nowadays. When it comes to learning LISP, there is this great book called “Structure and Interpretation of Computer Programs”, which teaches with Scheme, a LISP dialect, and is available online for free here.

Operations on Collections

Although Apple updated public APIs of many of their frameworks to use blocks, it seems that collections still don’t utilize the potential of blocks to the fullest possible extent.

I don’t like to write explicit loops every time I want to perform an action on the elements of an array, especially when the logic is really simple. Repeating the same enumeration code throughout a single project is not good programming practice, either. Back when there were no blocks in Objective-C we were pretty much limited to loops, but now, well, we have blocks.

Apple’s enumerateObjectsUsingBlock: and enumerateObjectsWithOptions:usingBlock: methods are great examples of how we can play with collections. But as great as they are, using them often requires the programmer to write too much boilerplate code.
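To make the boilerplate concrete, here is a minimal sketch of a simple filter written with enumerateObjectsUsingBlock:. The helper name EvenNumbersIn and the idea of filtering NSNumbers are made up for illustration; only the Foundation calls are real.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the even values from an array of NSNumbers.
static NSArray *EvenNumbersIn(NSArray *numbers) {
    // The caller has to create and manage the result container...
    NSMutableArray *result = [NSMutableArray array];

    [numbers enumerateObjectsUsingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
        // ...and spell out the "append if it matches" logic every time.
        if ([obj integerValue] % 2 == 0) {
            [result addObject:obj];
        }
    }];

    // Hand back an immutable snapshot of the collected objects.
    return [NSArray arrayWithArray:result];
}
```

Only the condition inside the if statement is specific to this task; everything around it would be repeated for every other filter.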

In general, the operations we perform in loops can be easily categorized into a few groups, and therefore it would be really helpful to have some template methods to support operations from all of these groups. Although these methods should be really plain, combining them would provide us with a tool to perform complex operations on data.
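As a rough sketch of what such template methods could look like, the category below covers three common groups: filtering, transforming, and testing. The ex_ prefix, the method names, and the signatures are assumptions made for this example; they are not the MCSCollectionUtility API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category for illustration only; names and signatures are
// assumptions, not the API of any existing library.
@interface NSArray (ExampleBlocks)
- (NSArray *)ex_filter:(BOOL (^)(id object))test;
- (NSArray *)ex_map:(id (^)(id object))transform;
- (BOOL)ex_any:(BOOL (^)(id object))test;
@end

@implementation NSArray (ExampleBlocks)

// Returns a new array with the objects for which `test` returns YES.
- (NSArray *)ex_filter:(BOOL (^)(id object))test {
    NSMutableArray *result = [NSMutableArray array];
    for (id object in self) {
        if (test(object)) {
            [result addObject:object];
        }
    }
    return [NSArray arrayWithArray:result];
}

// Returns a new array holding `transform(object)` for every object.
// A nil result is skipped, because NSArray cannot store nil.
- (NSArray *)ex_map:(id (^)(id object))transform {
    NSMutableArray *result = [NSMutableArray arrayWithCapacity:[self count]];
    for (id object in self) {
        id mapped = transform(object);
        if (mapped != nil) {
            [result addObject:mapped];
        }
    }
    return [NSArray arrayWithArray:result];
}

// Returns YES as soon as one object satisfies `test`, NO otherwise.
- (BOOL)ex_any:(BOOL (^)(id object))test {
    for (id object in self) {
        if (test(object)) {
            return YES;
        }
    }
    return NO;
}

@end
```

Each method hides one loop and returns a new value instead of modifying the receiver, so calls can be chained freely.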

Introducing the Categories

For the purposes of this post, I created a GitHub repository which contains categories for the NSArray, NSSet, and NSDictionary classes. The idea behind this repository is simple – to work with collections using as few lines of code as possible.

I try to achieve this goal through extensive use of blocks and the application of the functional programming approach. The repository is open source and available on GitHub here. All of the methods that appear in the repository are pretty well documented in the ‘README’ file, so getting to know them shouldn’t be a problem.

Traditional Approaches

In this section, I would like to show how operations on collections can look, using some of the more traditional approaches available to programmers using the Foundation framework API.

The task is to filter Person entities for people between the ages of 20 and 30, sort them by their last name, take 10 of them and check whether any of them has the first name ‘Steve’.

Putting something like that together using only for statements, either traditional ones or those connected to fast enumeration, would require us to write something like this:

 NSMutableArray *proceedPeople = [NSMutableArray array];

 // Keep only the people older than 20 and younger than 30.
 for (Person *person in people) {
    if (person.age > 20 && person.age < 30) {
        [proceedPeople addObject:person];
    }
 }

 // Sort in place; sortedArrayUsingSelector: would return a new array instead.
 [proceedPeople sortUsingSelector:@selector(someSelectorWhichSortsPeopleUsingLastName:)];

 // Take at most the first 10 people.
 NSRange range = NSMakeRange(0, [proceedPeople count] > 10 ? 10 : [proceedPeople count]);
 NSArray *firstPeople = [proceedPeople subarrayWithRange:range];

 // Search the filtered, sorted, truncated list, not the original one.
 BOOL isSteveHere = NO;
 for (Person *person in firstPeople) {
    if ([person.firstName isEqualToString:@"Steve"]) {
        isSteveHere = YES;
        break;
    }
 }

There will be people who will argue that we should use NSPredicate here and thus reduce the amount of code we need to achieve our goal, so let’s just roll with it:

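 NSPredicate *predicate = [NSPredicate predicateWithFormat:@"self.age > 20 AND self.age < 30"];
 NSArray *proceedPeople = [people filteredArrayUsingPredicate:predicate];

 // Sort by last name (assumes Person has a lastName property).
 NSSortDescriptor *byLastName = [NSSortDescriptor sortDescriptorWithKey:@"lastName" ascending:YES];
 proceedPeople = [proceedPeople sortedArrayUsingDescriptors:@[byLastName]];

 NSRange range = NSMakeRange(0, [proceedPeople count] > 10 ? 10 : [proceedPeople count]);
 proceedPeople = [proceedPeople subarrayWithRange:range];

 predicate = [NSPredicate predicateWithFormat:@"self.firstName == 'Steve'"];
 BOOL isSteveHere = [[proceedPeople filteredArrayUsingPredicate:predicate] count] > 0;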

I think using predicates has at least two downsides which seriously limit their usefulness.

First of all, NSPredicates aren’t as universal as blocks, and because of that they often need to be combined with standard loops, etc., to achieve the desired results. This doesn’t look good and it isn’t obvious enough for anyone reading through it.

Secondly, in my opinion the creation of NSPredicates is far from perfect. For me, the best way to create an NSPredicate is the predicateWithBlock: class method. But using a block to create a predicate which is later used to work with a collection is obviously a more roundabout approach than simply using blocks to work with collections directly. We can also use predicateWithFormat: to create the NSPredicate we want. As always, expressing program logic with strings is quite unsafe, as the compiler is nearly powerless when it comes to checking whether our query makes any sense at all. Creating predicates from strings is also pretty dangerous because the names of our properties can change, making the predicate’s string invalid – an error the compiler would not be able to detect.
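The detour through predicateWithBlock: can be sketched as follows. Both helper functions and their names are hypothetical and exist only for this comparison; the Foundation methods they call (predicateWithBlock:, filteredArrayUsingPredicate:, indexesOfObjectsPassingTest: and objectsAtIndexes:) are real.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps strings longer than minLength, via NSPredicate.
static NSArray *LongStringsViaPredicate(NSArray *strings, NSUInteger minLength) {
    // The block is wrapped in a predicate object just to be handed to the filter.
    NSPredicate *predicate =
        [NSPredicate predicateWithBlock:^BOOL(id evaluatedObject, NSDictionary *bindings) {
            return [(NSString *)evaluatedObject length] > minLength;
        }];
    return [strings filteredArrayUsingPredicate:predicate];
}

// Hypothetical helper: same result, passing the block to the array directly.
static NSArray *LongStringsViaBlock(NSArray *strings, NSUInteger minLength) {
    NSIndexSet *indexes =
        [strings indexesOfObjectsPassingTest:^BOOL(id obj, NSUInteger idx, BOOL *stop) {
            return [(NSString *)obj length] > minLength;
        }];
    return [strings objectsAtIndexes:indexes];
}
```

Both versions are checked by the compiler, unlike a format string, but each still takes two steps to express a single filter.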

An Approach Using a Functional-like API

Let’s try to implement a solution for the problem described in the previous section using the API available in the repository I mentioned above.

The code will look something like this:

BOOL isSteveHere = 
[[[[people mcs_select:^id(Person *person) {
     return person.age > 20 && person.age < 30 ? person : nil;

}] sortedArrayUsingComparator:^NSComparisonResult(Person *person1, Person *person2) {
     // The task asks for sorting by last name (assumes a lastName property).
     return [person1.lastName compare:person2.lastName];

}] mcs_take:10]
 mcs_any:^BOOL(Person *person) {
     return [person.firstName isEqualToString:@"Steve"];

}];

I think the code snippet above clearly demonstrates what I meant when I wrote that functional programming is all about evaluating functions. The entire snippet looks like one big expression, comprising multiple subfunctions that each return an intermediate result. Most importantly, it is clear and consistent. One can easily say what this expression does after a quick glance because the method names are highly descriptive.

Every subsequent method call performs one specific operation, and its result becomes the receiver of the next call. This is just like pipes in a UNIX shell, which let you build really powerful tasks out of a few smaller ones.

ls | grep file

Why do I think that this code is cleaner? Well, it looks good to me, but that’s a highly subjective opinion and can be easily discarded. Much more significant is the fact that there are no objects (especially mutable ones) whose state is changed throughout the entire operation. Everything is hidden from the user of the API in the implementation of the methods, so overall, the possibility that errors will occur is smaller.

Why do I think the code is consistent? The presented operation (and many, many others) can be implemented using a single API that employs blocks. There is no need to combine a couple of APIs which are totally different when it comes to the way we use them. The best example of such a combination is mixing predicates with loop enumeration.

Summary

In my opinion, operations on collections using blocks are a really nice addition to the standard collections API available in the Foundation framework. Their simplicity and expressiveness make code a lot more transparent to the reader and less error-prone.

As always, there was a problem with deciding whether a particular method should be available for a given class. The list of the methods implemented in the created repository isn’t closed, so if you have any suggestions for new methods, feel free to comment.
