hello mac

Solution:
1) Go into the bin directory under the newly installed Android Studio folder, find the idea.properties file, and open it with a text editor.
2) Append one line at the end of idea.properties: disable.android.first.run=true , then save the file.
3) Quit Android Studio and relaunch it; it now starts straight into the main screen.
This fixed it for me.
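Step 2 above can also be done from a shell; a minimal sketch, where the /tmp path is only a stand-in for your real Android Studio bin directory:

```shell
# Stand-in for <android-studio>/bin -- replace with your actual install path
mkdir -p /tmp/android-studio/bin && cd /tmp/android-studio/bin
touch idea.properties                      # the real file already exists
# Append the flag that skips the first-run setup wizard
echo 'disable.android.first.run=true' >> idea.properties
grep 'disable.android.first.run' idea.properties
```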


export PS1="\u@\h:\w\$ "   # \u user, \h host, \w working dir, \$ prints # for root

Backup of a script (commented-out lines from my .bashrc):

# IP="$(echo $SSH_CONNECTION | cut -d " " -f 1)"
# HOSTNAME=$(hostname)
# NOW=$(date +"%e %b %Y, %a %r")
# if [ "$IP" != "" ] ; then /home/eggfly/git.oschina.net/wechat-push/wechat.py lihaohua "" "" "0" "bash history at login:
# $(tail -c 500 ~/.bash_history)" && \
# /home/eggfly/git.oschina.net/wechat-push/wechat.py lihaohua "" "" "0" "Someone logged in from $IP to $HOSTNAME, last output:
# $(last | head -c 450)" ; fi
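A standalone sketch of the IP extraction used above, with a faked $SSH_CONNECTION so it runs anywhere (the real variable is set by sshd on login):

```shell
# SSH_CONNECTION is "client_ip client_port server_ip server_port";
# this value is faked for demonstration
SSH_CONNECTION="203.0.113.7 52100 198.51.100.2 22"
IP="$(echo "$SSH_CONNECTION" | cut -d ' ' -f 1)"
echo "$IP"   # -> 203.0.113.7
```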

a spark newbie

scala:

lazy val, implicit

MapPartitionsRDD: if a map replaces the key, the job becomes expensive; a long shuffle follows, sometimes costing even more than sortByKey.


Using Hadoop + Spark on Windows:

https://github.com/karthikj1/Hadoop-2.7.1-Windows-64-binaries

http://blog.csdn.net/u013226462/article/details/48848689
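After grabbing the 2.7.1 Windows binaries from the first link, the usual extra step is pointing HADOOP_HOME at them so winutils.exe is found; the paths below are examples, not taken from the links:

```shell
# Example paths only; adjust to wherever you unpacked the Windows binaries
export HADOOP_HOME=/c/hadoop-2.7.1
export PATH="$PATH:$HADOOP_HOME/bin"   # makes winutils.exe reachable
```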

16/09/23 23:09:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/23 23:09:09 INFO security.UserGroupInformation: Can't login from keytab, try to login from ticket cache
16/09/23 23:09:09 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm1
16/09/23 23:09:10 INFO yarn.Client: Requesting a new application from cluster with 523 NodeManagers
16/09/23 23:09:10 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/09/23 23:09:10 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Exception in thread "main" java.lang.IllegalArgumentException: Required AM memory (8192+2000 MB) is above the max threshold (8192 MB) of this cluster! Please increase the value of 'yarn.scheduler.maximum-allocation-mb'.
at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:292)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:141)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1085)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1145)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:749)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
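The fix for the error above is either to raise yarn.scheduler.maximum-allocation-mb on the cluster side or, if you don't control the cluster, to shrink the AM request below the 8192 MB container cap. A sketch of the second option; the class and jar names are placeholders:

```shell
# Driver memory + overhead must fit under the 8192 MB cap:
# 6g (6144 MB) + 2000 MB overhead = 8144 MB < 8192 MB
spark-submit \
  --master yarn --deploy-mode cluster \
  --driver-memory 6g \
  --conf spark.yarn.driver.memoryOverhead=2000 \
  --class com.example.Main app.jar   # placeholder class/jar
```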




Job aborted due to stage failure: Serialized task 1073:0 was 205781866 bytes, which exceeds max allowed: spark.akka.frameSize (134217728 bytes) - reserved (204800 bytes). Consider increasing spark.akka.frameSize or using broadcast variables for large values.
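spark.akka.frameSize is set in MB: the default 128 MB (134217728 bytes) is what the error cites, and the failing task serialized to about 197 MB, so 256 would clear it. A sketch; the class and jar names are placeholders, and broadcast variables remain the better long-term fix:

```shell
# Raise the Akka frame size from the 128 MB default to 256 MB
spark-submit \
  --conf spark.akka.frameSize=256 \
  --class com.example.Main app.jar   # placeholder class/jar
```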


1) spark.driver.maxResultSize 8g : the maximum memory the driver may use for collected results; since I process large matrices, I had no choice but to raise this.

2) spark.yarn.executor.memoryOverhead 2048 : after running for a while I found many executors using too much off-heap memory; this parameter helped somewhat.

3) spark.shuffle.blockTransferService nio : since Spark 1.2.0 the shuffle service defaults to netty, which is ridiculous; after switching back to nio, off-heap memory dropped a lot and processing time roughly halved.
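The three settings above can be passed together on one spark-submit line; the class and jar names are placeholders:

```shell
spark-submit \
  --conf spark.driver.maxResultSize=8g \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  --conf spark.shuffle.blockTransferService=nio \
  --class com.example.Main app.jar   # placeholder class/jar
```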

.bash_alias
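A couple of entries of the kind that would live in this file; the aliases themselves are my examples, not from the original notes:

```shell
# Example aliases; put them in the alias file and source it from ~/.bashrc
alias ll='ls -alF'     # long listing with file-type markers
alias gs='git status'  # quick repo status
```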


2016 AWS migration notes

  • Sign up with a new email address + phone verification + add a credit card
  • Launch the instance (skip per-minute detailed monitoring; it costs extra)
  • Attach an Elastic IP (free while the instance is running)
  • Set up DNS records
  • Configure alarms + an email notification group
  • Convert the .pem key into PuTTY public/private keys and log in as user ubuntu (weigh the risk before enabling password login)
  • In the security group, open inbound TCP port 1723 for PPTP, inbound ICMP for ping, and HTTP port 80; sshd's port 22 is open by default
  • sudo apt-get update && sudo apt-get dist-upgrade
  • Set up the VPN and shadowsocks
  • Web services, etc.
  • Add a calendar reminder for one year out
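The security-group step in the checklist above can be scripted with the AWS CLI; a sketch assuming the CLI is already configured, with sg-0123abcd standing in for your group id:

```shell
SG=sg-0123abcd   # placeholder security-group id
# PPTP control port
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol tcp --port 1723 --cidr 0.0.0.0/0
# HTTP
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol tcp --port 80 --cidr 0.0.0.0/0
# ping (all ICMP types)
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol icmp --port -1 --cidr 0.0.0.0/0
```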