TensorFlow Python APIs are automatically generated by Pybind11 and some utility scripts. This post describes how these Python APIs are generated.
Notes for TensorFlow Dev
Turn on VLOG
// Otherwise, set the TF_CPP_MIN_VLOG_LEVEL environment variable to update the minimum log level
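For example, a quick way to experiment with this from Python is to raise the VLOG threshold via the environment before TensorFlow is imported (a minimal sketch; level 2 is just an illustration, pick whichever level matches the VLOG calls you want to see):

```python
import os

# The C++ runtime reads these variables when TensorFlow is loaded,
# so they must be set before the import.
os.environ["TF_CPP_MIN_VLOG_LEVEL"] = "2"  # show VLOG messages up to level 2
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "0"   # keep INFO/WARNING/ERROR visible

import tensorflow as tf
print(tf.__version__)
```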
The Sun Never Sets -- A Trip to Alaska
- Day 1: Fly from San Francisco to Anchorage, AK with a layover in Seattle, pick up the rental car at the airport, and drive to Seward, AK (about 2.5 hours);
- Day 2: Hike to Exit Glacier in Kenai Fjords National Park (9:30am - 3:50pm); hiking boots and snow traction cleats are recommended;
- Day 3: Take a Kenai Fjords Tour boat to see the marine wildlife and glaciers of Kenai Fjords National Park; bring binoculars, some motion-sickness medicine, and extra warm layers; check-in for the tour is at 9am and it ends at 6:30pm; afterwards drive back to Anchorage for the night;
- Day 4: Spend most of the day driving to Denali National Park and Preserve; there are several viewpoints along the way for the snow-capped mountains; stay overnight at the nearby Denali Grizzly Bear Resort;
- Day 5: Take the Denali National Park and Preserve bus tour (9:30am - 6:30pm), mainly to see the park's scenery and wildlife; binoculars are recommended;
- Day 6: Visit the Aurora Ice Museum and soak in the hot springs (both at the same site on Chena Hot Springs Road, Fairbanks, AK), then visit the Santa Claus House;
- Day 7: Drive to Gakona, AK for a rest, staying at Gakona Lodge & Trading Post (Mile 2 Tok Cutoff, Gakona, AK 99586);
- Day 8: Hike Matanuska Glacier;
Install the standalone Spark Cluster
Checklist
Ensure all nodes can resolve each other by hostname/IP
Enable passwordless SSH among all nodes
Install JDK on each node
Export JAVA_HOME and SPARK_HOME in ~/.bashrc on each node
export JAVA_HOME=/home/feihu/jdk1.8.0_171
Configure spark-defaults.conf
spark.master spark://codait-gpu2:7077
Add the data nodes to the conf/slaves file
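Once the master and workers are up (for example via sbin/start-all.sh), a quick way to confirm the standalone cluster accepts jobs is a small PySpark smoke test. This is only a sketch that reuses the master URL from the spark-defaults.conf example above, so substitute your own host name:

```python
from pyspark.sql import SparkSession

# Master URL taken from the spark-defaults.conf example above.
spark = (SparkSession.builder
         .master("spark://codait-gpu2:7077")
         .appName("standalone-smoke-test")
         .getOrCreate())

# A trivial job spread over several partitions confirms that the
# workers are reachable and can execute tasks.
print(spark.sparkContext.parallelize(range(1000), 8).sum())

spark.stop()
```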
Connect Spark with Cloudera Yarn Cluster
Cloudera installs a Spark cluster automatically, but it is not easy to upgrade the Spark version through Cloudera. An alternative lets users run the latest version of Spark on a Cloudera YARN cluster.
- Set up the environment so that the Hadoop configuration files are visible to the Spark package
export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark/conf/yarn-conf/
- Start spark-shell from the downloaded Spark package
./spark-shell --master yarn --deploy-mode client --num-executors 10 --driver-memory 12g --executor-memory 10g --executor-cores 24
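The same submission can be sketched in PySpark instead of spark-shell. The property names below are standard Spark configuration keys; note that driver memory generally has to be fixed before the driver JVM starts (on the pyspark/spark-submit command line or in spark-defaults.conf), so it is left out of the in-code config:

```python
from pyspark.sql import SparkSession

# HADOOP_CONF_DIR must already point at the Cloudera yarn-conf directory
# (as exported above) so that "yarn" can be resolved as the master.
spark = (SparkSession.builder
         .master("yarn")
         .appName("spark-on-cloudera-yarn")
         .config("spark.submit.deployMode", "client")
         .config("spark.executor.instances", "10")
         .config("spark.executor.memory", "10g")
         .config("spark.executor.cores", "24")
         .getOrCreate())

# Quick check that executors actually come up on the YARN cluster.
print(spark.sparkContext.parallelize(range(100)).count())
spark.stop()
```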
Connect Zeppelin to Hive
Make sure that Hive can be accessed remotely using HiveServer2
* `bash-4.2$ beeline`
* `beeline> !connect jdbc:hive2://svr-A3-A-U2:10000 hive hive`
* If the connection fails, check the following configuration in `HIVE_HOME/conf/hive-site.xml`:
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
Move the configuration file $HIVE_HOME/conf/hive-site.xml to $ZEPPELIN_HOME/conf/hive-site.xml
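Independently of Zeppelin, the same HiveServer2 endpoint can be sanity-checked from Python. The sketch below uses the PyHive package, which is an assumption of mine and not part of the original setup:

```python
from pyhive import hive  # assumption: installed via `pip install 'pyhive[hive]'`

# Same host, port, and user as the beeline example above. With the default
# authentication a username is usually enough; LDAP/CUSTOM setups also need
# password=... and the matching auth= argument.
conn = hive.Connection(host="svr-A3-A-U2", port=10000, username="hive")

cur = conn.cursor()
cur.execute("SHOW DATABASES")
print(cur.fetchall())
conn.close()
```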
Set up PySpark with Keras
Install libraries
pip install -U matplotlib numpy pandas scipy jupyter ipython scikit-learn scikit-image openslide-python
Enable Jupyter to run PySpark by adding the following to ~/.bash_profile
export PYSPARK_DRIVER_PYTHON=jupyter
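Many setups also set PYSPARK_DRIVER_PYTHON_OPTS="notebook" so that launching pyspark opens a Jupyter notebook directly. Once the driver is running, a common pattern for combining PySpark with Keras is to load the model inside each partition and score data in parallel. The sketch below is only an illustration: the model path, input dimensionality, and random features are placeholders:

```python
import numpy as np
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("keras-on-pyspark").getOrCreate()

def score_partition(rows):
    # Import and load Keras inside the worker so every executor builds its
    # own copy; "model.h5" is a placeholder for a trained model that is
    # available on every node (or shipped with --files).
    from keras.models import load_model
    model = load_model("model.h5")
    batch = np.array([np.asarray(r, dtype=np.float32) for r in rows])
    if len(batch) == 0:
        return iter([])
    return iter(model.predict(batch).tolist())

# Placeholder feature vectors standing in for real data.
features = spark.sparkContext.parallelize(
    [np.random.rand(10) for _ in range(100)], 4)
predictions = features.mapPartitions(score_partition).collect()

spark.stop()
```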
Git Notes
Merge a remote pull request into the current branch
Add the following alias to ~/.gitconfig:
[alias]
    pr = "!f() { git fetch ${2:-upstream} pull/$1/head:pr/$1 && git checkout pr/$1; }; f"
Use `git pr PULL_NUM` to download the pull request. For example, `git pr 577` checks out PR 577 into a branch called `pr/577`.
Merge that branch into the target branch, for example `git merge pr/577`.
SystemML Workflow
Create a DML script
- Use the org.apache.sysml.api.mlcontext.Script class to create a DML script. The in() and out() functions are used to map the input and output values.
Handling DML
/org/apache/sysml/api/mlcontext/ScriptExecutor.class
public MLResults execute(Script script) {
More time is needed to understand constructHops(), rewriteHops(), and countCompiledMRJobsAndSparkInstructions().
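For reference, roughly the same in()/out() mapping is available from Python through the systemml package's MLContext API. The sketch below follows the MLContext programming guide as I remember it, so the exact method names should be double-checked against your SystemML version:

```python
from pyspark.sql import SparkSession
from systemml import MLContext, dml  # Python counterpart of the Script/MLContext API

spark = SparkSession.builder.appName("systemml-mlcontext-sketch").getOrCreate()
ml = MLContext(spark)

# A toy double-valued DataFrame standing in for real input data.
df = spark.createDataFrame([(float(i), float(i * i)) for i in range(1, 6)],
                           ["c1", "c2"])

dml_script = """
# X is bound from Spark via input(); s and m are handed back via output().
s = sum(X)
m = mean(X)
"""

script = dml(dml_script).input(X=df).output("s", "m")
results = ml.execute(script)
print(results.get("s"), results.get("m"))

spark.stop()
```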
CS231n_Notes_1.1: Image Classification
The Image Classification problem is the task of assigning an input image one label from a fixed set of categories. Many other seemingly distinct Computer Vision tasks (such as object detection and segmentation) can be reduced to image classification.
Challenges
- Viewpoint variation: A single instance of an object can be oriented in many ways with respect to the camera
- Scale variation: Visual classes often exhibit variation in their size (size in the real world, not only in terms of their extent in the image)
- Deformation: Many objects of interest are not rigid bodies and can be deformed in extreme ways.
- Occlusion: The objects of interest can be occluded. Sometimes only a small portion of an object (as little as a few pixels) could be visible.
- Illumination conditions: The effects of illumination are drastic on the pixel level.
- Background clutter: The objects of interest may blend into their environment, making them hard to identify.
- Intra-class variation: The classes of interest can often be relatively broad, such as chair. There are many different types of these objects, each with their own appearance.
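To make the task definition above concrete, here is a deliberately naive data-driven classifier in the spirit of these notes: memorize the training images, then label each test image with the label of its closest training image under L1 distance on raw pixels. It is only a sketch of the train/predict interface and, as the challenges above suggest, it handles none of them well:

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        # X: (N, D) array of flattened images, y: (N,) array of labels.
        # "Training" is simply memorizing the data.
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        # Label each test image with the label of the closest training
        # image under L1 (sum of absolute pixel differences).
        preds = np.zeros(X.shape[0], dtype=self.ytr.dtype)
        for i in range(X.shape[0]):
            distances = np.sum(np.abs(self.Xtr - X[i]), axis=1)
            preds[i] = self.ytr[np.argmin(distances)]
        return preds

# Toy usage with random "images" from 3 made-up categories.
Xtr = np.random.rand(30, 32 * 32 * 3)
ytr = np.random.randint(0, 3, size=30)
clf = NearestNeighbor()
clf.train(Xtr, ytr)
print(clf.predict(np.random.rand(5, 32 * 32 * 3)))
```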