Friday, February 15, 2008

The busy Java developer's guide to Scala: Packages and access modifiers

Recently, reader feedback has led me to realize that I have left out an important aspect of the Scala language while crafting this series: Scala's package and access modifier facilities. So I'm going to take a moment and cover this before diving into one of the more functional elements of the language, the apply mechanism.


To help segregate code so that names in one body of code don't conflict with names in another, the Java™ language provides the package keyword, creating a lexical namespace in which classes are declared. In essence, putting a class Foo in a package named com.tedneward.util changes the formal class name to com.tedneward.util.Foo, and it must be referenced as such. Java programmers will be quick to point out that they don't do this; they import the package and thus save themselves from typing out the formal name. This is true, but it merely means that the work of referencing the class by its formal name falls to the compiler and the bytecode. A quick glance at javap output reveals this to be the case.

Packages in the Java language have a few quirks, however. The package declaration must appear at the top of the .java file in which the package-scoped classes appear (which causes some serious havoc with the language when trying to apply annotations to the package), and the declaration holds scope across the entire file. This means that in the rare case where two classes are tightly coupled across package boundaries, they have to be split across files, leading the unwary to miss the tight coupling between the two.

Scala takes a slightly different approach with respect to packaging, treating it as a combination of the Java language's declaration approach and C#'s scoped approach. With that in mind, a Java developer can follow the traditional Java approach and put a package declaration at the top of a .scala file just as normal Java classes do; the package declaration applies across the entire file scope just as it does in Java code. Alternatively, a Scala developer can use Scala's package "scoping" approach in which curly braces delimit the scope of the package statement, as in Listing 1:

Listing 1. Packaging made simple
package com {
  package tedneward {
    package scala {
      package demonstration {
        object App {
          def main(args : Array[String]) : Unit = {
            System.out.println("Howdy, from packaged code!")
            args.foreach((i) => System.out.println("Got " + i) )
          }
        }
      }
    }
  }
}


Effectively, this code declares one class, App, or to be more precise, a single class called com.tedneward.scala.demonstration.App. Note that Scala also permits the package names to be dot-separated, so Listing 1 could be written more tersely as shown in Listing 2:

Listing 2. Packaging made simple (redux)
package com.tedneward.scala.demonstration {
  object App {
    def main(args : Array[String]) : Unit = {
      System.out.println("Howdy, from packaged code!")
      args.foreach((i) => System.out.println("Got " + i) )
    }
  }
}


Use whichever style seems more appropriate because they each compile into exactly the same code constructs. (The scalac compiler goes ahead and generates the .class files in package-declared subdirectories as javac does.)
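One consequence of the scoped form is that a single source file can hold classes from more than one package, which is the tightly coupled case mentioned earlier. What follows is a minimal sketch of that idea rather than anything from the series' sample code; the packages com.example.model and com.example.report, and everything declared in them, are hypothetical names chosen for illustration.

```scala
// Hypothetical names throughout: a sketch of two packages sharing one source file.
package com.example {
  package model {
    // Fully qualified name: com.example.model.Invoice
    class Invoice(val amount: Int)
  }

  package report {
    // Fully qualified name: com.example.report.Summary
    object Summary {
      // Code nested inside com.example can see its sibling package "model",
      // so model.Invoice resolves here without an import.
      def describe(invoice: model.Invoice): String =
        "Invoice for " + invoice.amount
    }
  }
}
```

Client code elsewhere would still refer to the two types by their full names (or import them), exactly as if they had been declared in separate files.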


Of course, the logical flip-side of packaging is import, Scala's mechanism for bringing names into the current lexical namespace. Readers of this series have already seen import in a couple of samples before now, but it's time for me to point out some of import's features that will come as a surprise to Java developers.

First of all, you can use import anywhere inside a Scala file, not just at the top, and the import is scoped to the block in which it appears. Thus, in Listing 3, the java.math.BigInteger import is scoped entirely to the methods defined inside the object App and nowhere else. If another class or object inside mathfun wanted to use java.math.BigInteger, it would need to import the class just as App did. Or, if several classes in mathfun all wanted to use java.math.BigInteger, the import could occur at the package level, outside the definition of App, and all of the classes in that package scope would have BigInteger imported.

Listing 3. Import scoping
package com {
  package tedneward {
    package scala {
      // ...
      package mathfun {
        object App {
          import java.math.BigInteger

          def factorial(arg : BigInteger) : BigInteger =
            if (arg == BigInteger.ZERO) BigInteger.ONE
            else arg multiply (factorial (arg subtract BigInteger.ONE))

          def main(args : Array[String]) : Unit =
            if (args.length > 0)
              System.out.println("factorial " + args(0) +
                " = " + factorial(new BigInteger(args(0))))
            else
              System.out.println("factorial 0 = 1")
        }
      }
    }
  }
}
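
The package-level variant described above, where one import serves several definitions, might look something like the following sketch. The package com.example.mathfun and the objects Doubler and Squarer are hypothetical; only java.math.BigInteger and its add and multiply methods are real.

```scala
// Hypothetical package and object names; only java.math.BigInteger is real.
package com.example.mathfun {
  // Imported once at package scope, so every definition in this block sees it.
  import java.math.BigInteger

  object Doubler {
    def twice(n: BigInteger): BigInteger = n.add(n)
  }

  object Squarer {
    def square(n: BigInteger): BigInteger = n.multiply(n)
  }
}
```

Code outside this package block does not inherit the import and would have to bring in BigInteger for itself.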


Importing doesn't stop there, however. Scala sees no real reason to differentiate between top-level members and nested ones, so you can use import to bring not just nested types into lexical scope, but any member; by importing all of the names inside java.math.BigInteger, for example, you can drop the scoped references to ZERO and ONE to just name references as in Listing 4:

Listing 4. Static imports ... without the static
package com {
  package tedneward {
    package scala {
      // ...
      package mathfun {
        object App {
          import java.math.BigInteger
          import BigInteger._

          def factorial(arg : BigInteger) : BigInteger =
            if (arg == ZERO) ONE
            else arg multiply (factorial (arg subtract ONE))

          def main(args : Array[String]) : Unit =
            if (args.length > 0)
              System.out.println("factorial " + args(0) +
                " = " + factorial(new BigInteger(args(0))))
            else
              System.out.println("factorial 0 = 1")
        }
      }
    }
  }
}


By using the underscore (remember the wildcard character in Scala?), you effectively tell the Scala compiler that all of the members inside BigInteger should be brought into scope. And because BigInteger has already been put into scope by the previous import statement, there's no need to explicitly package-qualify the class name. In fact, these could even be combined into a single statement because import can take multiple, comma-separated targets to import (shown in Listing 5):

Listing 5. Bulk imports
package com {
  package tedneward {
    package scala {
      // ...
      package mathfun {
        object App {
          import java.math.BigInteger, BigInteger._

          def factorial(arg : BigInteger) : BigInteger =
            if (arg == ZERO) ONE
            else arg multiply (factorial (arg subtract ONE))

          def main(args : Array[String]) : Unit =
            if (args.length > 0)
              System.out.println("factorial " + args(0) +
                " = " + factorial(new BigInteger(args(0))))
            else
              System.out.println("factorial 0 = 1")
        }
      }
    }
  }
}


This saves you a line or two. Note that the two targets cannot be collapsed into one: the first imports the BigInteger class itself, and the second imports the members inside that class.

You can also use import to introduce non-constant members. For example, consider a math utility library (of questionable value perhaps, but still...) in Listing 6:

Listing 6. Enron's accounting code
package com {
  package tedneward {
    package scala {
      // ...
      package mathfun {
        object BizarroMath {
          def bizplus(a : Int, b : Int) = { a - b }
          def bizminus(a : Int, b : Int) = { a + b }
          def bizmultiply(a : Int, b : Int) = { a / b }
          def bizdivide(a : Int, b : Int) = { a * b }
        }
      }
    }
  }
}


Using this library could get quite annoying over time, having to type BizarroMath every time one of its members was requested, but Scala allows for each of the members of BizarroMath to be imported into the top-level lexical namespace, almost as if they were global functions (shown in Listing 7):

Listing 7. Calculating Enron's expenses
package com {
  package tedneward {
    package scala {
      package demonstration {
        object App2 {
          def main(args : Array[String]) : Unit = {
            import com.tedneward.scala.mathfun.BizarroMath._
            System.out.println("2 + 2 = " + bizplus(2,2))
          }
        }
      }
    }
  }
}


There are other interesting constructs that would allow a Scala developer to write the more natural 2 bizplus 2, but that will have to wait for another day. (Readers curious about a potentially heavily-abusable Scala feature can look at the Scala implicit construct as covered in Programming in Scala by Odersky, Spoon, and Venners.)


While packaging (and importing) are part of the encapsulation and packaging story in Scala, a large part of it, as with Java code, lies in its ability to restrict access to certain members in a selective way — in other words, in Scala's ability to mark certain members "public," "private," or somewhere in-between.

The Java language has four levels of access: public, private, protected, and package-level access (frustratingly applied by leaving out any keyword). Scala:

  • Does away with package-level qualification (in a way)
  • Uses "public" by default
  • Specifies "private" to mean "accessible only to this scope"

By contrast, "protected" is definitely different from its counterpart in Java code; where a Java protected member is accessible to both subclasses and the package in which the member is defined, Scala chooses to grant access only to subclasses. This means that Scala's version of protected is more restrictive (although arguably more intuitively so) than the Java version.
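To make the difference concrete, here is a small sketch in which a subclass and a non-subclass live in the same package as a protected member. The package com.example.access and the classes Ledger, AuditedLedger, and Neighbor are hypothetical names invented for this illustration.

```scala
// Hypothetical classes, all in one package, to show subclass-only access.
package com.example.access {
  class Ledger {
    protected def adjustment: Int = 5
  }

  class AuditedLedger extends Ledger {
    // Allowed: AuditedLedger is a subclass of Ledger.
    def exposed: Int = adjustment * 2
  }

  object Neighbor {
    // Same package as Ledger, but not a subclass. The Java equivalent would
    // compile; in Scala the commented-out line is rejected by the compiler.
    // def peek(l: Ledger): Int = l.adjustment
    def viaSubclass(l: AuditedLedger): Int = l.exposed
  }
}
```

Neighbor can reach the value only through whatever the subclass chooses to expose publicly.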

Where Scala truly steps away from Java code, however, is that access modifiers in Scala can be "qualified" with a package name, indicating the scope up to which the member may be accessed. For example, if the BizarroMath object wants to grant access to a member to other members of the same package (but not to subclasses), it can use the code in Listing 8 to do so:

Listing 8. Enron's accounting code
package com {
  package tedneward {
    package scala {
      // ...
      package mathfun {
        object BizarroMath {
          def bizplus(a : Int, b : Int) = { a - b }
          def bizminus(a : Int, b : Int) = { a + b }
          def bizmultiply(a : Int, b : Int) = { a / b }
          def bizdivide(a : Int, b : Int) = { a * b }
          private[mathfun] def bizexp(a : Int, b: Int) = 0
        }
      }
    }
  }
}


Note the private[mathfun] expression here. In essence, the access modifier is saying that this member is private up to the package mathfun; this means that any member of the package mathfun has access to bizexp but nothing outside of that package does, including subclasses.

The powerful implication is that any enclosing package can be named in a "private" or "protected" qualifier, all the way up to com (or even _root_, which is an alias for the root namespace, thus essentially making private[_root_] the same thing as "public"). This provides a degree of flexibility in access specification far beyond what the Java language provides.
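As a rough sketch of how the choice of qualifier widens or narrows visibility, consider two sibling packages under a common parent. The packages com.example.finance, books, and audit, along with the Ledger and Auditor objects, are hypothetical and exist only for this illustration.

```scala
// Hypothetical packages: finance encloses both books and audit.
package com.example.finance {
  package books {
    object Ledger {
      // Visible anywhere under com.example.finance, including sibling packages.
      private[finance] def rawTotal: Int = 100
      // Visible only inside com.example.finance.books.
      private[books] def fudgeFactor: Int = 3
      def reportedTotal: Int = rawTotal * fudgeFactor
    }
  }

  package audit {
    object Auditor {
      // Allowed: audit lives inside finance, the qualifier on rawTotal.
      def check: Int = books.Ledger.rawTotal
      // Rejected by the compiler if uncommented: audit is outside books.
      // def leak: Int = books.Ledger.fudgeFactor
    }
  }
}
```

Naming a package further up the chain simply enlarges the set of code that passes the compiler's access check.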

In fact, Scala offers one more degree of access specification: the object-private specification, illustrated by private[this], which stipulates that the member in question can only be seen by members called on that same object, not from different objects, even if they are of the same type. (This closes a small hole in the Java access specification system that was useful for Java programming interview questions and not much more.)
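The distinction between class-private and object-private can be sketched as follows. The Account class and its members are hypothetical, and the code is written in Scala 2 style; newer compilers may treat private[this] differently, so take it as an illustration of the idea rather than a recommendation.

```scala
// Hypothetical class contrasting private with private[this].
class Account(initial: Int, secret: Int) {
  private val balance: Int = initial      // class-private
  private[this] val pin: Int = secret     // object-private

  // Allowed: plain private lets one Account read another Account's field.
  def richerThan(other: Account): Boolean = balance > other.balance

  // Allowed: pin is read on this object only.
  def unlock(attempt: Int): Boolean = attempt == pin

  // Rejected by the compiler if uncommented: other.pin belongs to a
  // different object, even though it has the same type.
  // def samePin(other: Account): Boolean = pin == other.pin
}
```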

Note that the access modifiers will have to map on top of the JVM at some level and as a result, some of the subtleties in their definition will be lost when compiled or called from regular Java code. For example, the BizarroMath example above (with the private[mathfun]-declared member bizexp) will generate the class definition in Listing 9 (when viewed with javap):

Listing 9. Enron's accounting library, JVM view
Compiled from "packaging.scala"
public final class com.tedneward.scala.mathfun.BizarroMath
   extends java.lang.Object {
    public static final int $tag();
    public static final int bizexp(int, int);
    public static final int bizdivide(int, int);
    public static final int bizmultiply(int, int);
    public static final int bizminus(int, int);
    public static final int bizplus(int, int);
}


As is obvious from the second method listed in the compiled BizarroMath class, the bizexp() method was given a JVM-level access specifier of public, which means that the subtle private[mathfun] distinction was lost once the Scala compiler finished its access checks. As a result, for Scala code that is intended to be used from Java code, I'd prefer to stick with the traditional "private" and "public" definitions. (Even "protected" will sometimes end up mapping to JVM-level "public," so when in doubt, consult javap against the actual compiled bytecode to be certain of its access level.)


In the preceding article in the series ("Collection types"), when talking about arrays in Scala (Array[T]s to be exact) I said, "obtaining the i'th element of the array" was in fact "another one of those methods with funny names...." As it turns out, although I didn't want to get into the details then, this wasn't exactly true.

OK, I admit it, I lied.

Technically, the use of the parentheses against the Array[T] class is a tad bit more complicated than simply a "method with a funny name"; Scala reserves a particular nomenclature association for that particular sequence of characters (that being the left-parens-right-parens sequence) because that is so often used with a particular intent in mind: that of "doing" something (or in functionalspeak, "applying" something to something).

In other words, Scala has a special syntax (more accurately, a special syntactic relationship) in place for the "application" operator "()". To be precise, Scala recognizes the method called apply() as the method to invoke when said object is invoked using () as the method call. For example, a class that wants to behave as a functor (an object that acts as a function) can define an apply method to provide function- or method-like semantics:

Listing 10. Play that Functor music, code boy!
class ApplyTest {
  import org.junit._, Assert._

  @Test def simpleApply = {
    class Functor {
      def apply() : String =
        "Doing something without arguments"

      def apply(i : Int) : String =
        if (i == 0)
          "Done"
        else
          "Applying... " + apply(i - 1)
    }

    val f = new Functor
    assertEquals("Doing something without arguments", f() )
    assertEquals("Applying... Applying... Applying... Done", f(3))
  }
}


Curious readers will be wondering what makes a functor different from an anonymous function or closure. As it turns out, the relationship is fairly obvious: The Function1 type in the standard Scala library (meaning a function that takes one parameter) has an apply method on its definition. A quick glance through some of the generated Scala anonymous classes for Scala anonymous functions will reveal that the generated classes are descendants of Function1 (or Function2 or Function3, depending on how many parameters the function takes).
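The relationship can be seen without looking at generated classes at all. In this sketch (FunctionApplyDemo, increment, asFunction1, and AddN are hypothetical names), an anonymous function is called through apply explicitly, and a hand-written functor that extends Function1 is passed where a function is expected.

```scala
// Sketch only; the names below are illustrative.
object FunctionApplyDemo {
  // An anonymous function of one Int parameter...
  val increment = (i: Int) => i + 1

  // ...can be assigned to the Function1 type it implements,
  val asFunction1: Function1[Int, Int] = increment

  // and a hand-written class with apply can extend the same type.
  class AddN(n: Int) extends Function1[Int, Int] {
    def apply(i: Int): Int = i + n
  }
}
```

Calling FunctionApplyDemo.increment(2) and FunctionApplyDemo.increment.apply(2) is the same invocation, and List(1, 2, 3).map(new FunctionApplyDemo.AddN(10)) accepts the functor exactly as it would accept a closure.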

This means that where anonymous or named functions don't necessarily fit the design approach desired, a Scala developer can create a functor class, provide it with some initialization data stored in fields, and then execute it via () without any common base class required (as would be the case for a traditional Strategy pattern implementation):

Listing 11. I said 'play that Functor music, code boy!'
class ApplyTest {
  import org.junit._, Assert._

  // ...

  @Test def functorStrategy = {
    class GoodAdder {
      def apply(lhs : Int, rhs : Int) : Int = lhs + rhs
    }
    class BadAdder(inflateResults : Int) {
      def apply(lhs : Int, rhs : Int) : Int = lhs + rhs * inflateResults
    }

    val calculator = new GoodAdder
    assertEquals(4, calculator(2, 2))
    val enronAccountant = new BadAdder(50)
    assertEquals(102, enronAccountant(2, 2))
  }
}


Any class that provides the appropriately-argumented apply method will work when called as long as the arguments line up in number and in type.


Scala's packaging, import, and access modifier mechanisms provide a fine degree of control and encapsulation that a traditional Java programmer has never enjoyed. For example, they offer the ability to import select methods of an object, making them appear as global methods without the traditional drawbacks of global methods. This makes working with those methods exceedingly easy, particularly when they provide higher-order functionality, such as the fictitious tryWithLogging function introduced earlier in this series ("Don't get thrown for a loop!").

Similarly, the "application" mechanism allows Scala to hide execution details behind a functional facade such that programmers may not even know (or care) that the thing they are invoking isn't actually a function but an object imbued with significant complexity. The mechanism provides another dimension to Scala's functional nature, one which can certainly be done from the Java language (or C# or C++ for that matter) but not with the degree of syntactic purity Scala provides.

That's it for this installment; until next time, enjoy!
