2013-12-23

Null in FME 2014: Handling Null with Python / Tcl

Important, 2014-01-29: I heard that Safe is planning to change the implementation of the Python API method fmeobjects.FMEFeature.getAttribute() so that it returns an empty string when the specified attribute stores <null>. Currently (FME 2014 build 14234) it returns None in that case.
After confirming the change, I will revise the related descriptions (underlined) in this article.
-----
2014-02-14: I noticed that the method in FME 2014 SP1 Beta (build 14255) returns an empty string for <null>. The change seems to have been made for SP1.
-----
2014-02-25: The change to the FME Objects Python API has been announced. I revised the related descriptions in this article (underlined).

(FME 2014 Beta build 14223)

These Python API methods and Tcl procedures were added in FME 2014 to handle <null> attributes appropriately.
New Python Methods
- FMEFeature.getAttributeNullMissingAndType(attrName)
- FMEFeature.setAttributeNullWithType(attrName, attrType)
New Tcl Procedures
- FME_IsAttributeNull attrName
- FME_SetAttributeNull attrName

I tried them out to learn what they do and how to use them, and this article summarizes the results. If any description is wrong, please point it out.
Official descriptions of those methods / procedures can be found in
- FME Objects Python API: [FME_HOME]/fmeobjects/python/apidoc/index.html
- FME pre-defined Tcl procedures: TclCaller transformer help documentation

1. Determine if an attribute contains <null>
Python: FMEFeature.getAttributeNullMissingAndType Method

FME 2014 SP1+ (build 14252 or later):
The FMEFeature.getAttribute method returns an empty string when the specified attribute is <null>.
If the script needs to distinguish <null> from an empty string, we have to use the getAttributeNullMissingAndType method (added in FME 2014).

FME 2014 without SP*:
The FMEFeature.getAttribute method returns None when the specified attribute is <null> or <missing>.
If the script needs to distinguish <null> from <missing>, we have to use the getAttributeNullMissingAndType method (added in FME 2014).

The method returns a tuple consisting of 3 elements - null flag (boolean), missing flag (boolean) and data type identifier (int). The null flag indicates whether the attribute contains <null>, and the missing flag indicates whether the attribute is <missing>.
For example, when a feature has two attributes
  attrEmpty = <empty string>
  attrNull = <null>
a PythonCaller with this script replaces each of them with a string indicating its original status. The script also checks attrMissing, which the feature does not have.
-----
import fmeobjects

def testNullMissingEmpty(feature):
    for name in ['attrEmpty', 'attrNull', 'attrMissing']:
        # Named attrType so that the built-in type() is not shadowed.
        isNull, isMissing, attrType = feature.getAttributeNullMissingAndType(name)
        if isMissing:
            value = 'this was missing'
        elif isNull:
            value = 'this was null'
        else:
            value = feature.getAttribute(name)
            # Compare directly; str() could fail on non-ASCII text.
            if value == '':
                value = 'this was empty'
        feature.setAttribute(name, value)
-----
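The same check can be wrapped in a small helper that returns the status as a string. This is only a minimal sketch: attributeState, FakeFeature and _NULL are names I made up for illustration, and FakeFeature is a stand-in that imitates the SP1+ behavior described above so that the helper can be tried outside FME. I assume here that getAttribute returns None for a <missing> attribute, and the type identifier 0 is just a placeholder. In a PythonCaller, the real feature object would be passed instead.

```python
# Stand-in marker for <null>; NOT part of fmeobjects.
_NULL = object()


class FakeFeature(object):
    """Hypothetical stand-in for fmeobjects.FMEFeature (for trying outside FME)."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def getAttributeNullMissingAndType(self, name):
        # Returns (isNull, isMissing, typeId); typeId 0 is a placeholder.
        if name not in self._attrs:
            return (False, True, 0)
        return (self._attrs[name] is _NULL, False, 0)

    def getAttribute(self, name):
        # Imitates SP1+: <null> reads as ''. None for <missing> is an assumption.
        value = self._attrs.get(name)
        return '' if value is _NULL else value


def attributeState(feature, name):
    """Classify an attribute as 'missing', 'null', 'empty' or 'value'."""
    isNull, isMissing, _ = feature.getAttributeNullMissingAndType(name)
    if isMissing:
        return 'missing'
    if isNull:
        return 'null'
    return 'empty' if feature.getAttribute(name) == '' else 'value'


feature = FakeFeature({'attrEmpty': '', 'attrNull': _NULL, 'attrText': 'abc'})
names = ['attrEmpty', 'attrNull', 'attrMissing', 'attrText']
states = dict((n, attributeState(feature, n)) for n in names)
```

Because the helper asks getAttributeNullMissingAndType first, it does not depend on what getAttribute returns for <null>, so it should behave the same with or without SP1.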

Tcl: FME_IsAttributeNull Procedure
A pre-defined Tcl procedure named FME_IsAttributeNull has been added in FME 2014.
A TclCaller with this script does the same job as the PythonCaller above.
-----
proc testNullMissingEmpty {} {
  foreach name {"attrEmpty" "attrNull" "attrMissing"} {
    set value {}
    if {[FME_AttributeExists $name] == 0} {
      set value "this was missing"
    } elseif {[FME_IsAttributeNull $name]} {
      set value "this was null"
    } else {
      set value [FME_GetAttribute $name]
      if {[string length $value] < 1} {
        set value "this was empty"
      }
    }
    FME_SetAttribute $name $value
  }
}
-----

2. Set <null> to attributes
Python: FMEFeature.setAttributeNullWithType Method
If we need to set an attribute to <null>, the FMEFeature.setAttributeNullWithType method (added in FME 2014) can be used.
A PythonCaller with this example script replaces "toNull" with <null>, replaces "toEmpty" with <empty string>, and removes attributes whose value is "toMissing".
-----
import fmeobjects

def mapToNullMissingEmpty(feature):
    for name in feature.getAllAttributeNames():
        # Named attrType so that the built-in type() is not shadowed.
        isNull, isMissing, attrType = feature.getAttributeNullMissingAndType(name)
        if not isNull and not isMissing:
            # Compare directly; str() could fail on non-ASCII text.
            value = feature.getAttribute(name)
            if value == 'toNull':
                # Keep the original data type when setting <null>.
                feature.setAttributeNullWithType(name, attrType)
            elif value == 'toEmpty':
                feature.setAttribute(name, '')
            elif value == 'toMissing':
                feature.removeAttribute(name)
-----

Tcl: FME_SetAttributeNull Procedure
A pre-defined Tcl procedure named FME_SetAttributeNull has been added in FME 2014.
A TclCaller with this script does the same job as the PythonCaller above.
-----
proc mapToNullMissingEmpty {} {
  foreach name [FME_AttributeNames] {
    set value [FME_GetAttribute $name]
    if {[string compare $value "toNull"] == 0} {
      FME_SetAttributeNull $name
    } elseif {[string compare $value "toEmpty"] == 0} {
      FME_SetAttribute $name {}
    } elseif {[string compare $value "toMissing"] == 0} {
      FME_UnsetAttributes $name
    }
  }
}
-----

3. Handle <null> elements in a list attribute
The Python FMEFeature.getAttribute method returns a list of strings when the name of an existing list attribute (e.g. "_list{}") is passed as its argument. This is a very convenient way to handle list attributes, and I have often used it.
But now (in FME 2014+), we should be aware that every <null> element is interpreted as an empty string in that case.

For example, the "copyList1" function in the following script copies
_src{} = A,<null>,B,<null>,C
to
_dest{} = A,,B,,C
-----
import fmeobjects
def copyList1(feature):
    src = feature.getAttribute('_src{}')
    feature.setAttribute('_dest{}', src)
-----

If <null> elements in the source list have to stay <null> in the destination list too, the script should look like this. It assumes that there is no <missing> element in the source list.
-----
import fmeobjects

def copyList2(feature):
    i = 0
    while True:
        src, dest = '_src{%d}' % i, '_dest{%d}' % i
        # Named attrType so that the built-in type() is not shadowed.
        isNull, isMissing, attrType = feature.getAttributeNullMissingAndType(src)
        if isMissing:
            # The first <missing> element marks the end of the list.
            break
        if isNull:
            feature.setAttributeNullWithType(dest, attrType)
        else:
            feature.setAttribute(dest, feature.getAttribute(src))
        i += 1
-----

I think the difference between the results of "copyList1" and "copyList2" is worth remembering.

Alternatively, if the number of list elements has been stored as an attribute (e.g. _element_count) beforehand, this script produces the same result. The number of list elements can be stored with the ListElementCounter transformer.
2014-01-29: If the implementation of getAttribute changes, this script will no longer be able to determine whether the value is <null>, and it will produce unexpected results.
I will remove the description and this script after confirming the change.
-----
2014-02-25: Removed. In FME 2014 SP1+ getAttribute returns an empty string for <null>, so the script below works only in FME 2014 without SP*.
-----
import fmeobjects

# Works only in FME 2014 without SP*, where getAttribute returns None for <null>.
def copyList2(feature):
    num = int(feature.getAttribute('_element_count'))
    for i in range(num):
        src, dest = '_src{%d}' % i, '_dest{%d}' % i
        value = feature.getAttribute(src)
        if value is None:
            # Named attrType so that the built-in type() is not shadowed.
            attrType = feature.getAttributeType(src)
            feature.setAttributeNullWithType(dest, attrType)
        else:
            feature.setAttribute(dest, value)
-----

The Tcl script that does the same job can be a little simpler.
-----
proc copyList2 {} {
  for {set i 0} {[FME_AttributeExists "_src{$i}"]} {incr i} {
    if {[FME_IsAttributeNull "_src{$i}"]} {
      FME_SetAttributeNull "_dest{$i}"
    } else {
      FME_SetAttribute "_dest{$i}" [FME_GetAttribute "_src{$i}"]
    }
  }
}
-----
