Tuesday, January 21, 2020

Understanding versus working state

I'm noticing a prevalent and very problematic trend in development, and particularly in DevOps as a new discipline.  It seems that most developers (and operations/DevOps folks) are valuing working state over systemic understanding.  That is, people at the command line will continue to cut and paste solutions from Azure documentation or StackOverflow until it works.  At the end of it, they probably won't have much of an idea of why it's all working.  This leads to a system that is very fragile.  It works today, but tomorrow could bring breaking changes, and nobody will have any idea why they broke the system, or how to fix it!

I'm noticing that the increasing trend in online documentation is to present solution-based documentation, which is useful in a pinch, but ultimately very counter-productive.  The same is starting to be true of books too, if one even exists in the domain you're looking for that isn't just a reprint of the online documentation and thus has very little explanation.

So I'm going to do my best here to write up an account of working my way into a new Azure account and getting things working.  As a developer/CTO who is interested first and foremost in system longevity and reproducibility, I will focus not just on code examples but also on explanation of what is going on (as far as I understand it).

Thursday, November 29, 2018

Future[T] and Future[Try[T]] - Round Two!

In my last post, I was discussing some of the issues I'm facing around Future[T], and how, if you look inside the implementation of Future, what you see is

val value: Option[Try[T]].

Of course, for my part, I want access to that juicy Try[T].  Without it, flows can just fail.  A case in point is the post-process code in our system:

for {
  flowDone <- flow.runWith(Sink.foreach(e => logger.trace("Updated reference " + e)))
  catDone <- categoryService.updateCategoryListWithProductCount(loginId)
  childSkus <- backPropagateChildSkus(loginId)
  reIndexDone <- searchReindexerService.callReIndex(loginId)(ws)
} yield reIndexDone

Once the product ingest work is done, I then want to chain that future through a series including updating category classifications, updating Child Sku information and sending the whole lot off wholesale for reindexing in the search cluster.

With the current implementation and function of Future, even if these methods all return Future[Try[T]], any one of them might fail for an unpredictable reason and complete as a failed Future: the Future's own internal Try is a Failure, and the inner Try[T] value is never exposed.  And in fact, on the first execution this is precisely what happened.  I suspect another exception was swallowed, because whilst all the products seem to have been inserted, the reindex started, as indicated in the logging, but didn't complete successfully.
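
To make that failure mode concrete, here is a minimal sketch using only the standard library.  The three service methods are hypothetical stand-ins for the calls in the for-comprehension above (the real ones are not shown in this post), and the blocking Await is only there so the outcome can be inspected:

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Success, Try}

object SwallowedFailureDemo {
  private val reIndexCalled = new AtomicBoolean(false)
  def reIndexWasCalled: Boolean = reIndexCalled.get

  // Hypothetical stand-in: behaves as advertised and returns an inner Try.
  def updateCategories(): Future[Try[Int]] = Future(Success(42))

  // Declared as Future[Try[Int]], but the exception escapes while the Future
  // body runs, so the Future itself fails and no inner Try is ever produced.
  def backPropagate(): Future[Try[Int]] =
    Future(throw new IllegalStateException("boom"))

  def reIndex(): Future[Try[String]] = {
    reIndexCalled.set(true)
    Future(Success("reindexed"))
  }

  def pipeline(): Future[Try[String]] = for {
    _    <- updateCategories()
    _    <- backPropagate()
    done <- reIndex() // skipped: the previous step failed the whole chain
  } yield done

  // Await.ready waits for completion without rethrowing the failure, so the
  // Future's own Option[Try[...]] can be looked at directly.
  def outcome(): Option[Try[Try[String]]] =
    Await.ready(pipeline(), 5.seconds).value
}
```

Calling SwallowedFailureDemo.outcome() yields a Some wrapping an outer Failure, and reIndex() is never invoked.  Unless somebody inspects that outer Failure, nothing is logged at all.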

Something must be done!

Attempts to do something very brute force, like subverting Future[T] to return Future[Try[T]] in all cases, seem dubious; particularly as we now have the problem of somehow making sure Future[Try[T]] still returns Future[Try[T]] and not Future[Try[Try[T]]].  My type-fu is not strong enough to untangle that one; and perhaps that's a good thing.

So having arrived at that conclusion, what do I want to do about it?  I think I'm going to give this a spin:

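import scala.concurrent.{ExecutionContext, Future}
import scala.language.implicitConversions
import scala.util.{Failure, Success, Try}

class FutureTryAutoTransformer[T](original: Future[Try[T]])(implicit executionContext: ExecutionContext) {
  // The outer Try is the Future's own result; the inner Try is the payload.
  // A failed Future is folded into the inner Try, so the lifted Future
  // always completes successfully.
  val liftF: Try[Try[T]] => Try[Try[T]] = {
    case Success(Success(v)) => Success(Success(v))
    case Success(Failure(e)) => Success(Failure(e))
    case Failure(e) => Success(Failure(e))
  }

  def lift: Future[Try[T]] = original.transform[Try[T]](liftF)(executionContext)
}

// Implicit conversions should declare their result type explicitly.
implicit def future2FutureTryAutoTransformer[T](
  f: Future[Try[T]])(implicit executionContext: ExecutionContext): FutureTryAutoTransformer[T] =
   new FutureTryAutoTransformer[T](f)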

With that little piece of magic, I have a way to "lift" the implicit Future[Try[Try[T]]] up so that the inner and outer Try components are handled together, and I get back a Future[Try[T]] with no hidden agendas!!
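
As a quick check of what lifting buys you, here is a small self-contained sketch.  It restates the same idea as a plain function rather than an implicit conversion (the names LiftDemo and demo are mine, purely for illustration), and blocks with Await only so the result can be looked at:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object LiftDemo {
  // Fold a failure of the Future itself into the inner Try, so the
  // returned Future always completes successfully.
  def lift[T](original: Future[Try[T]])(implicit ec: ExecutionContext): Future[Try[T]] =
    original.transform[Try[T]] {
      case Success(inner) => Success(inner)
      case Failure(e)     => Success(Failure(e))
    }

  def demo(): Try[Int] = {
    import ExecutionContext.Implicits.global
    // A Future that is typed as Future[Try[Int]] but fails outright.
    val failing: Future[Try[Int]] = Future.failed(new RuntimeException("swallowed"))
    // Await.result does not throw here: the lifted Future succeeded, and the
    // original exception is now an ordinary value inside the Try.
    Await.result(lift(failing), 5.seconds)
  }
}
```

LiftDemo.demo() returns a Failure carrying the original exception, which can then be pattern matched or logged like any other Try.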

Let's see how we feel about that - I've heard tell of a scalaz solution which I might look into... watch this space!

Tuesday, November 27, 2018

On Future[T] and Future[Try[T]]

In the Scala universe there is some debate about the use of Future[Try[T]], and how best to encapsulate failure in the context of a Future.  For my part, I like using Monads to communicate context, and meaning of the expectation, especially around failure.  One of the biggest reasons for the existence of Option[T] is to pro-actively handle null cases; and with Try[T], the same thing with exceptions (noting that Try is not technically Monadic, but... it's close).  This becomes especially bothersome once you drop into an Akka Streams situation with flows where errors can easily just get eaten by the system completely, with no exception trace or notification.  I have one application that ingests millions of rows, and occasionally the flow blows up, and what do you see in the logging?  Nothing at all.

So - how can you at least address this situation?  If you're like me, and like explicit understanding based on your type arrows, how can you wrap this up in a way that gives you the context you desire?


  import scala.concurrent.{ExecutionContext, Future}
  import scala.util.{Failure, Success, Try}

  // For code that already returns Future[Try[T]]: fold a failed Future, or an
  // exception thrown before the Future is even created, into the inner Try.
  def tryingFutTry[T](f: => Future[Try[T]])(implicit executionContext: ExecutionContext): Future[Try[T]] =
    try {
      f.recoverWith({ case e: Throwable => Future.successful(Failure(e)) })
    }
    catch {
      case e: Exception => Future.successful(Failure(e))
    }

  // For code that returns a plain Future[T]: wrap the result in Success first.
  def tryingFut[T](f: => Future[T])(implicit executionContext: ExecutionContext): Future[Try[T]] =
    try {
      f.map[Try[T]](Success.apply).recoverWith({ case e: Throwable => Future.successful(Failure(e)) })
    }
    catch {
      case e: Exception => Future.successful(Failure(e))
    }

  // Synchronous variant; no ExecutionContext is needed here.
  def trying[T](f: => T): Try[T] = try {
    Success(f)
  }
  catch {
    case e: Exception => Failure(e)
  }

Whilst this isn't very pretty, or perhaps even very well named (I'm not loving it yet), it does at least give you a way to "lift" a Future that isn't Try-wrapped into a Try context, pulling out the failure case that would otherwise get eaten inside the Future, and to expose swallowed exceptions when an explicit Try context may not be catching everything.
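
Here is a rough usage sketch of the idea.  It restates tryingFut compactly so the example stands alone, and ingest is a hypothetical function of mine that fails in two different ways: synchronously, before any Future exists, and asynchronously, inside the Future body:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object TryingFutDemo {
  // Compact restatement of tryingFut from above.
  def tryingFut[T](f: => Future[T])(implicit ec: ExecutionContext): Future[Try[T]] =
    try f.map[Try[T]](Success(_)).recover { case e: Throwable => Failure(e) }
    catch { case e: Exception => Future.successful(Failure(e)) }

  // Hypothetical ingest step, for illustration only.
  def ingest(rows: Int)(implicit ec: ExecutionContext): Future[Int] = {
    // Thrown synchronously, before a Future has been created.
    if (rows < 0) throw new IllegalArgumentException("negative row count")
    // Thrown asynchronously, inside the Future body.
    Future { if (rows == 0) sys.error("empty batch") else rows * 2 }
  }

  def run(rows: Int): Try[Int] = {
    import ExecutionContext.Implicits.global
    // Blocking is only here so the result can be inspected.
    Await.result(tryingFut(ingest(rows)), 5.seconds)
  }
}
```

run(21) comes back as Success(42), while run(0) and run(-1) both come back as a Failure value rather than as a failed Future or a thrown exception, so in all three cases the caller has something explicit to match on.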

A lot of these ideas were taken from a blog post that I found here:
https://alvinalexander.com/scala/how-exceptions-work-scala-futures-oncomplete-failure

Wednesday, April 11, 2018

On Good Code

Good code.  What makes a codebase good?  What makes good code... well, good?

Coming into a new company again has refreshed my mind on what it is like to delve into a complex pre-existing codebase for the first time.  Sometimes the experience is agonizing, sometimes it's fairly straightforward, and sometimes it lies somewhere in between.

I remember when I first started at PlayStation, who are, for the most part, a Java shop.  Getting my environment set up, discovering the shape of things, where to find things, what libraries were there and beginning to dig into the project I'd be working on.  Opening up my IDE for the first time and initializing the first Maven pom into a project for IntelliJ to index and for me to digest.

I pulled out my OmniGraffle, and started making diagrams for my own edification, tracing from the start of the application flow, where requests arrive, and following the call flow all the way down into the guts of the application where SQL queries start popping up and the particulars of the relational organization of the platform become evident, making flow charts and relational diagrams as I go so I can understand what kind of a picture it all paints.

Today, coming into a new company, I begin the process all over again, but this time I'm in a director role, so today, my chief concern is more to do with deliverables and project timelines, overarching technology goals and direction than it is with the low level of our implementation.  However, being a startup, it means that whilst my primary focus is more long-term, I have to be conversant with all those gruesome details, and able to function therein.  It's PHP, it's AWS, it's Docker, it's Linux, it's MySQL.

My standards for what "good code" looks like are very high.  Over the years I've discovered this in my professional life, and that most people aren't nearly as exacting as I would prefer.  It's been a long journey from the young up-and-comer who left Southampton University for the U.S.A. with a baby on the way and dived into commercial software development with gusto to today, an Engineering Director for a small company in California.  Over those years I've worked on many systems, from the smallest companies like ZiftIt, where the technology team consisted of me doing all the core engineering, a front-end guy doing UI, and the CTO, to vast sprawling multi-billion dollar organizations like Sony PlayStation where I was just one cog in a vast machine delivering content at massive scale to users the world over.

One common thread that shows up across these companies is that good code makes a difference.  Not in theory, but in practice.  I've lived in companies where the toxicity of the codebase rose up and strangled the organization from within as it took more engineers just to beat back the zombies and skeletons of rushed implementations, where interacting with the system became an effort in managing the edge cases that were so prevalent it was like trying to play patty-cake with Edward Scissorhands, and the edge cases weren't so much at the edges as baked all the way through.

Let me start by describing what good code feels like.  When your codebase is good, it feels safe.  It's a warm blanket that welcomes you to work in the morning, where you feel confident that your timelines are accurate.  Where you can estimate with ease, and new features are just a matter of solving for the complexity of the design.  Where you go home on Friday thinking about refactoring something, arrive Monday, and the refactoring is done by close of business Tuesday.  Where when a business owner asks for a new feature, you smile and say, it mostly already does that because it's just a logical extension of the relational design.  Where you can look at your database, and immediately get a sense of what the data means.

Contrast that with bad code.  Where any step taken is fraught with peril.  You can't change anything for fear of the whole system collapsing like a house of cards; worse even, you daren't even step heavily around it, in case the table shakes and the whole thing just collapses apparently of its own accord.  Where implementing anything requires long heavy test cycles that seem to take forever, and where business is always angry that what they are being given is so full of bugs and problems that never seem to go away all the way.

I want to take a moment now to look at why this is.  What makes one codebase such a pleasure to work with, and another such a horrible pain?  Let's think about the human psyche, where we came from and who we are for the world; let's get metaphysical for a moment.  When you open up a good novel and dive in, what is it that is engaging?  When you look at a page of mathematics, unless you have a PhD in Math, why does it occur as noise?  All this points to the first trait of a good codebase:

It tells a story.

Open up the source code to your project, and see: what is the story it's telling?  Can you tell?  The statistics suggest that the average developer spends 10x as much time reading code in a day as they spend writing it.  If your codebase isn't telling a compelling story, and doing so in the way a good book does, you've probably got a pile of frustrated coders on your hands.  When you open up a class or script and your brain fires off in horror "Oh my god, who wrote this?!", or "Oh my god, what does this even do?!", you know you might have a problem.

I'm not an English major, my wife holds that distinction in our family, but I know that a good story has compelling characters, solid plot, a place it starts, a clear direction, and a place it ends up.  The best stories might be surprising or insightful or emotional; but they are all engaging and compelling.

If your codebase doesn't tell a compelling story, there's a pretty good chance that your product doesn't either, and that your company doesn't either.  Conway's law says that companies write applications with the same structure as their organization.  If your code base looks a certain way, it might be an indication of your organizational culture, and, that might also be something you want to look at.

If you look at the Clean Code book by Robert C. Martin, you'll see that code that has clearly distinguished levels of abstraction will have a mixture of function or method types.  There will be methods that talk almost in English: return userDataAccess.fetchUsers() map (getPersonalData andThen tokenize).  A non-developer can, with just a little explanation of what "map" does, fully understand what this accomplishes!  This function tells a story.  If you don't have functions or methods in your code that look like this, this is a strong symptom of failing to have appropriate levels of abstraction.  It also means that you likely have a great deal of copy/paste going on in your system, and that refactoring anything is going to result in you finding places where the same operation is performed with slight variations that were never normalized.
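
To show what sits underneath a line like that, here is a small sketch.  Only the last expression mirrors the one-liner above; the User type, the sample data and the bodies of getPersonalData and tokenize are placeholders I've made up for illustration:

```scala
object StoryDemo {
  // Hypothetical domain type.
  final case class User(name: String, email: String)

  // The service component: the only place that knows where users come from.
  object userDataAccess {
    def fetchUsers(): List[User] =
      List(User("Ada", "ada@example.com"), User("Grace", "grace@example.com"))
  }

  // Low-level steps, each with exactly one job.
  val getPersonalData: User => String = _.email
  // Placeholder "tokenization": replaces the value with a marker of its length.
  val tokenize: String => String = data => "token(" + data.length + ")"

  // The high-level method reads almost like English.
  def tokenizedPersonalData(): List[String] =
    userDataAccess.fetchUsers() map (getPersonalData andThen tokenize)
}
```

The detail lives in the small named pieces; the method that composes them only tells the story.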

Okay, so you've realized that what you have on your hands is a dry math paper, and not a novel.  What can you do about it?

Cure for code that doesn't tell a story: normalize the heck out of it.

Go through, start seeing where there are services present, like userDataAccess.fetchUsers.  If what happens all over the place is raw queries to your datastore, you'll benefit from normalizing this into a service component.  Normalize with a passion, normalize with vigor.  You'll see the size of your codebase shrinking and shrinking.  You'll start to see the story of your system emerge.  And, you'll start to see productivity rise.  If you didn't have any before, you'll be able to write tests now.  If you had tests before, you'll start seeing them simplify greatly.  You'll start to see developers actually want to write them because it enables them to develop new features faster.  Incidentally, this is also one antidote for tangling and scattering, another common problem.

That's enough pontificating for one morning.  I'll come back and write a part two, where I'll talk more about tangling and scattering, and also abstraction versus simplification.

Tuesday, April 10, 2018

Technology... The Madness of MySQL

Back working for a start-up again.  This brings all the pluses and minuses as per usual.  Crazy hours sometimes, fun projects, more control, less process.

It also brings something else.  MySQL.

Anyone who knows me, knows just how much visceral hate I have for this "database".  In the last 48 hours, I've learnt two more fun facts to add to the list of solid reasons why to skip MySQL in favor of better solutions:

1. Nullable false leads to an implicit default.

We use Liquibase to version our database.  It's good.  But, it means that generally I find myself writing schema in XML format, not in SQL format.  Creating a new column on a table is easy enough, and there's even a handy XML block for constraints.  I copied the definition from another column definition from before, which included a nullable=false constraint, which on first glance, seemed appropriate enough.

I run the update, and wonder what's taking so long...

Turns out if you specify nullable="false", MySQL will use a default value, all databases would, that's perfectly sensible.  The thing that's not sensible is that in this case, I didn't specify one!  So MySQL, instead of throwing an error, telling me my schema change is invalid and missing something, just goes ahead and implies a default.  Stop implying things MySQL, you're guessing what I mean, not doing what I say.  This is generally a bad thing for software to do.  And it implied I meant that I did in fact want a default value and that default value should be 0 for a bigint column.  Not a bad assumption necessarily, but, an assumption nonetheless; and to assume, makes an ass out of u and me, but in this case, mostly just me.  This of course then expended a lot of CPU resources to apply as this particular table has a great many rows.  No problem, kill the session and remove the constraint.

Which leads us swiftly to number 2...

2. Adding a column to a table in MySQL requires a full table update, even in InnoDB.

o_o. o_O.  O_O.

I think in 2018, every single other database engine handles this correctly.  It's a meta-data change.  Not so in MySQL.  Even with no default value, the database engine insists on rebuilding the entire database store.

So if you're an enterprise with a large(ish) table, and you need a new column, downtime will be required.  Downtime in 2018 is not what users have come to expect.  Downtime is never acceptable to users.

MySQL.  I didn't think I would discover new reasons to despise you for being a pile of poorly implemented not really ACID compliant unhelpful non-SQL standard compliant database.  I was unpleasantly surprised.

Tuesday, August 23, 2016

Law of Demeter and perhaps something more strict that isn't quite

Looking at a piece of code today and thinking about the consequences of discovering things about objects a method is passed.

If I have a method that is responsible for performing a mapping, let's call it from type A to type B so

A -> B

or to use a more Scala-ish syntax:

f(A): B

then I might argue that the method f should not attempt to enhance in any way the object of type A.  If the properties required for the construction of B are not immediately present in A as per the law of Demeter, or if we recast the system slightly such that A may represent a composite, the set of objects represented by A, then there should be an intermediary method that gathers the required information for the mapping operation and creates an enhanced context.  This would be a separation of concerns, one being the enhancement of the object of type A, and the other the operation of generating a B.

A -> B then expands to A -> C -> B

where C is the set of information required to construct B, so we might get two methods:

constructB(C): B so that our type arrow is C -> B

and enhance(A): C so that the type arrow is A -> C

This means that should somebody construct logic that does some kind of side-effecting operation in enhance(), that operation can be isolated from the mapping.

This means that when we look at a mapping function, we can say it should not contain ANY additional type arrows within.  It should only access and map properties from the composite C to create an object of type B.  Any additional mapping or derivation that occurs within, or the processing of a type arrow breaks separation of concerns.
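
A small sketch of that split, with made-up types standing in for A, C and B.  Note one assumption: enhance here takes its lookup table as an explicit second parameter, which is my choice for keeping the example self-contained and not part of the A -> C signature above:

```scala
object MappingDemo {
  // Hypothetical types, for illustration only.
  final case class OrderRef(customerId: Int, sku: String)          // A
  final case class OrderContext(customerName: String, sku: String) // C
  final case class Invoice(line: String)                           // B

  // enhance: A -> C.  Any lookup (or side effect) is confined to this step.
  def enhance(a: OrderRef, customerNames: Map[Int, String]): OrderContext =
    OrderContext(customerNames.getOrElse(a.customerId, "unknown"), a.sku)

  // constructB: C -> B.  Reads properties of C and nothing else; there are
  // no further type arrows hidden inside.
  def constructB(c: OrderContext): Invoice =
    Invoice(s"${c.customerName}: ${c.sku}")

  // The original f: A -> B is now just the composition of the two.
  def f(a: OrderRef, customerNames: Map[Int, String]): Invoice =
    constructB(enhance(a, customerNames))
}
```

If somebody later needs a database call or a cache write, it has exactly one place to go, enhance, and constructB stays a pure mapping that is trivial to test.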

This feels pretty strict, but I'm looking at code today where if that rule had been followed, a very nasty side-effecting piece of code that was buried several levels deep in an abstraction would never have been permitted!

Thursday, August 20, 2015

Java 8 - Exploring FunctionalInterface

A few days ago I posted a highly frustrated post on Facebook about Java having lambdas but not having any Try<> mechanism, meaning that in most cases you're left declaring a try block inside a lambda.  Turns out there's a different way to approach this that gives a different resolution.

Say you're a Scala person like me, and have discovered that checked exceptions actually are more of a pain than they're worth, and believe they actually break SOLID engineering principles, particularly the part about encapsulation.  When first exploring Java 8, it seemed to me that the lack of a Try<> type was pretty bad news.  I still think Try<> would be useful, but there is at least a way to get around having a very ugly try/catch block inside a lambda.

So Java, I take it back - you've done something weird, but cool.  Turns out you don't need to worry about Function<> specifically; any interface that declares only a single abstract method is functional, and will be eligible for syntax magic.  (Though I don't like magic, it's at least traceable magic).  It's perfectly valid to declare:

@FunctionalInterface
public interface ExceptionalFunction<A, B> {
    B f(A a) throws Exception;
    default B apply(A a) {
        try { return f(a); }
        catch (Exception e) { throw new RuntimeException(e); }
    }
}

and then a method that uses it, thusly:

public <T> T withMyThing(ExceptionalFunction<MyThing, T> f) {
  // Call the default apply(), which wraps any checked exception thrown by f().
  return f.apply(fetchMyThing());
}

and then

withMyThing(x -> isAwesome(x));

or because apparently you can:

withMyThing(this::isAwesome);

This means that if isAwesome() throws a checked exception, our wrapper will capture it and rethrow it wrapped in a RuntimeException.  I'm not going to debate the merits of that here, only to say that here be dragons, and that probably breaks expected behavior in many situations, but, at the same time can be pretty useful too, particularly in test suites, where exceptional behavior is either being explicitly elicited, or explicitly checked against.  Though I suppose that if you're eliciting it, getting back a RuntimeException containing the expected exception might break the test case... like I said, here be dragons.

Though we might have noticed that now apparently interfaces in Java can have method bodies... Uh wut?  This doesn't seem any different to me than having, say:

// @FunctionalInterface is only permitted on interfaces, so it is omitted here.
public abstract class ExceptionalFunction<A, B> {
  public abstract B f(A a) throws Exception;
  public B apply(A a) {
    try { return f(a); }
    catch (Exception e) { throw new RuntimeException(e); }
  }
}

I suppose it does have the syntactic implication that the function you're declaring could be something other than public, which in a functional interface context wouldn't make sense, but perhaps that should be a compiler error rather than changing what an "Interface" fundamentally means in Java?

So be ye here forewarned: Interfaces in Java 8 may have method bodies!!