This feature allows users to create an avatar of themselves by essentially capturing a short selfie video. During the process, the user is guided and receives feedback to make sure that we get all the data we need. The captured data is then uploaded to our cloud for processing. Once the processing has finished, the resulting avatar can be downloaded and used with our mobile SDKs. Since users only capture their face, the body is purely virtual and can be adjusted; more information on this can be found in the respective sections. The resulting avatar can be used with our mix & match functionality in 2D. The following steps outline the process of creating an avatar this way:

  • Instruct the user on the capturing process

  • Perform the capturing and guide the user

  • Upload the data to our Content Service

  • Download the resulting avatar

  • Allow the user to adjust their avatar

The following short visual illustrates the capturing process.

Capturing & User Guidance

The logic for capturing is provided by the RRTrueDepthCaptureView class. To perform a head avatar capturing, show an instance of RRTrueDepthCaptureView and call its startCapturingWithStorageDirectory method. To achieve a good user experience, you should also implement the RRTrueDepthCaptureViewDelegate protocol and provide it as the view's delegate. This allows you to receive feedback from the capturing component and instruct the user accordingly.
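
The following sketch illustrates this setup. It is a minimal example and makes a few assumptions: the SDK module name, the storyboard/IBOutlet wiring and the exact Swift spelling of startCapturingWithStorageDirectory are not confirmed by this guide, so adapt them to the API docs and your project setup. The speech synthesizer properties and the showOverlayMessage helper are included here because the delegate snippets below make use of them.

import UIKit
import AVFoundation
import PictofitCore // assumed module name for the Pictofit iOS SDK

class HeadCaptureViewController: UIViewController, RRTrueDepthCaptureViewDelegate {

  // The capture view is assumed to be created in a storyboard / XIB
  @IBOutlet weak var headCapturingView: RRTrueDepthCaptureView!

  // Used by the delegate snippets below to give audio feedback
  let speechSynthesizer = AVSpeechSynthesizer()
  var lastSpeechFeedbackText: String? = nil

  override func viewDidLoad() {
    super.viewDidLoad()
    // Receive capture instructions, warnings and completion callbacks
    headCapturingView.delegate = self
  }

  func startCapture() {
    // Directory where the captured frames are stored until they are uploaded
    let storageDirectory = FileManager.default.temporaryDirectory
      .appendingPathComponent("head-capture", isDirectory: true)
    try? FileManager.default.createDirectory(at: storageDirectory,
                                             withIntermediateDirectories: true)

    // Assumed Swift spelling of startCapturingWithStorageDirectory
    headCapturingView.startCapturing(withStorageDirectory: storageDirectory.path)
  }

  func showOverlayMessage(_ text: String?) {
    // App-specific: show or hide an on-screen message for the user
  }
}
SWIFT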

To make sure that the capturing works out fine and users know what to do, you should show instructions before starting the process. Best practice is to display a short video clip that explains the capturing process. Additionally, showing a bullet point list that mentions all the important points is advisable. The list should cover the following points:

  • Long hair should be put in front of the shoulders

  • Sit upright and face the sun/light

  • Turn your phone volume up to hear audio instructions

  • Remove glasses

The following video gives an example of what the capturing process could look like:

The following code snippet shows a simple example of how the updatedCaptureWarning and updatedCaptureInstruction delegate methods can be used to give the user feedback. You should provide your users with the instructions and warnings the delegate delivers to guide them. During capturing, you can either use on-screen instructions or give the user audio feedback.

func trueDepthCaptureView(_ captureView: RRTrueDepthCaptureView, updatedCaptureWarning warning: RRTrueDepthCaptureViewCaptureWarning) {
  var text : String? = nil
  switch warning {
  case .noFaceDetected:
    text = "No face detected"
  default:
    text = nil
  }
  
  if let text = text, self.lastSpeechFeedbackText != text {
    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = AVSpeechSynthesisVoice(language: "en-US") // "en-EN" is not a valid language code
    self.speechSynthesizer.speak(utterance)
    self.lastSpeechFeedbackText = text
  }
  
  if captureView.isCapturing == false { // Only show overlay messages while not capturing since the user should not look at the screen during capturing
    self.showOverlayMessage(text)
  }
}

func trueDepthCaptureView(_ captureView: RRTrueDepthCaptureView, updatedCaptureInstruction instruction: RRTrueDepthCaptureViewCaptureInstruction) {
  var text : String? = nil
  switch instruction {
  case .centerToTheLeft, .centerToTheRight:
    text = "Center horizontally"
  case .moveDown:
    text = "Move down"
  case .moveUp:
    text = "Move up"
  default:
    text = nil
  }
  
  if let text = text, self.lastSpeechFeedbackText != text {
    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = AVSpeechSynthesisVoice(language: "en-US") // "en-EN" is not a valid language code
    self.speechSynthesizer.speak(utterance)
    self.lastSpeechFeedbackText = text
  }
}
SWIFT

As soon as the capturing has finished, the RRTrueDepthCaptureViewDelegate method trueDepthCaptureViewFinishedCapturing will be triggered. Use this to signal to the user that the capturing is done. The trueDepthCaptureViewFinishedCapturing callback also provides a capturing quality summary that includes several boolean properties indicating possible quality warnings about the finished session. Have a look at the API docs for the possible warning types. If there are warnings, they should be presented to the user and you should allow the user to restart the capturing if they want to give it another try.
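
The sketch below shows one way to handle this. The exact Swift signature of trueDepthCaptureViewFinishedCapturing, the name of the quality summary type and its boolean properties (glassesDetected, insufficientLighting) are placeholders for illustration; the real names are listed in the API docs. presentQualityWarnings is a hypothetical app-specific helper.

// The delegate signature, the summary type and its properties are placeholders –
// check the API docs for the actual names.
func trueDepthCaptureViewFinishedCapturing(_ captureView: RRTrueDepthCaptureView,
                                           qualitySummary: RRTrueDepthCaptureQualitySummary) {
  showOverlayMessage("Capturing finished")

  // Collect the quality warnings raised for this session (hypothetical properties)
  var warnings: [String] = []
  if qualitySummary.glassesDetected {
    warnings.append("Please remove your glasses for the next attempt.")
  }
  if qualitySummary.insufficientLighting {
    warnings.append("The scene was too dark. Please face a light source.")
  }

  if !warnings.isEmpty {
    // App-specific helper: present the warnings and offer to restart the capturing
    presentQualityWarnings(warnings, onRetry: { self.startCapture() })
  }
}
SWIFT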

Data Transfer

After you receive the delegate's trueDepthCaptureViewFinishedCapturing callback, some processing will happen internally within the RRTrueDepthCaptureView class. After the processing has finished, the delegate's trueDepthCaptureViewDataReady method will be triggered. This callback tells you that the captured data is ready for upload. From this point on you can access the captured data and transfer it to our PICTOFiT Content Service. The data consists of several captured frames where each frame is represented by an instance of the RRTrueDepthKeyframe class.
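
A straightforward way to react to this callback is to start the upload immediately, as sketched below. The exact Swift spelling of trueDepthCaptureViewDataReady is an assumption; uploadData is the function shown in the next snippet.

// Assumed Swift spelling of the trueDepthCaptureViewDataReady delegate method
func trueDepthCaptureViewDataReady(_ captureView: RRTrueDepthCaptureView) {
  // The captured keyframes are now accessible – transfer them to the Content Service
  uploadData(headCapturingView: captureView)
}
SWIFT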

The following code snippet shows how you can access the captured data. The file names used in this sample already follow the naming scheme you must use when uploading the files to our Pictofit Content Service so that we can process the data properly.

func uploadData(headCapturingView: RRTrueDepthCaptureView) {
  var fileNames: [String] = []
  var fileData: [Data] = []

  let numKeyframes = headCapturingView.capturedFramesCount
  
  for keyframeId in 0..<numKeyframes {
    let keyframe = headCapturingView.getCapturedKeyframe(forFrameID: keyframeId)!
    fileNames.append("color_\(keyframeId)")
    fileData.append(keyframe.colorData)
                     
    fileNames.append("depth_\(keyframeId)")
    fileData.append(keyframe.depthData)
    
    fileNames.append("metadata_\(keyframeId)")
    fileData.append(keyframe.metadataToJSON())
  }
  
  uploadFiles(fileNames, fileData) // App-specific upload to the Content Service (see below)
}
SWIFT

The following list summarises again what you need to upload so that we can generate an avatar for you:

  • One color image file per captured keyframe named color_<keyframe index> with Content Service file type ADDITIONAL_VIEW

  • One depth image file per captured keyframe named depth_<keyframe index> with Content Service file type DEPTH

  • One metadata file per captured keyframe named metadata_<keyframe index> with Content Service file type MATRIX

Please check the Content Service API documentation on how to upload data.
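
If you organise your upload code around the file names from the previous snippet, the Content Service file type can be derived from the name prefix. The following sketch illustrates this mapping; contentServiceFileType(for:) is a hypothetical helper, and the actual upload request depends on the Content Service API.

// Hypothetical helper: maps the file names from the capture snippet to the
// Content Service file types listed above.
func contentServiceFileType(for fileName: String) -> String {
  if fileName.hasPrefix("color_") { return "ADDITIONAL_VIEW" }
  if fileName.hasPrefix("depth_") { return "DEPTH" }
  if fileName.hasPrefix("metadata_") { return "MATRIX" }
  fatalError("Unexpected file name: \(fileName)")
}
SWIFT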

Additionally, you need to upload a metadata JSON dictionary to the Content Service. This metadata must be uploaded as the product entity's metadata. To create this JSON metadata, use the RRHeadAvatar3DProductMetadata class provided by the Pictofit iOS SDK. The following code snippet shows how to use it:

func getProductMetadata(avatarName: String, gender: RRGender, bodyHeight: Int) -> String {
  let metadata = RRHeadAvatar3DProductMetadata()
  metadata.avatarName = avatarName
  metadata.gender = gender
  metadata.bodyHeight = bodyHeight
  
  return metadata.getJsonString()
}
SWIFT

Here’s a detailed description of the JSON values and their format:

  • avatarName: An arbitrary display name for the captured avatar as a string

  • gender: The avatar gender as a string, which will define the template body model that will be used. Currently supported values: "female" and "male"

  • bodyHeight: The full body height of the captured user in centimeters. Data type must be integer

  • platform: The platform of the capturing device as a string

  • deviceName: The device identifier of the capturing device as a string

  • osVersion: The OS version number of the capturing device as a string
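
To illustrate, the snippet below calls the helper from above and shows what the resulting JSON could look like. The avatar values and the RRGender case name are made-up examples, and whether the device-related values (platform, deviceName, osVersion) are filled in automatically by the metadata class or need to be set explicitly is not covered here, so check the API docs.

// The gender case name (.female) is an assumption – check the RRGender definition
let metadataJson = getProductMetadata(avatarName: "Jane", gender: .female, bodyHeight: 172)

// Illustrative example of the resulting JSON (all values are placeholders):
// {
//   "avatarName": "Jane",
//   "gender": "female",
//   "bodyHeight": 172,
//   "platform": "iOS",
//   "deviceName": "iPhone14,2",
//   "osVersion": "16.4"
// }
SWIFT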

Once the processing has finished, you can download the resulting avatar and present the Interactive Avatar Configurator to the user as the next step.